Abstract

The minimal vertex-cover (or maximal independent-set) problem is studied on ran- 
dom graphs of finite connectivity. Analytical results are obtained by a mapping to 
a lattice gas of hard spheres of (chemical) radius one, and they are found to be in 
excellent agreement with numerical simulations. We give a detailed description of the 
replica-symmetric phase, including the size and the entropy of the minimal vertex covers, 
and the structure of the unfrozen component which is found to percolate at connectivity
c ≈ 1.43. The replica-symmetric solution breaks down at c = e ≈ 2.72. We give a simple
one-step replica symmetry broken solution, and discuss the problems in interpretation 
and generalization of this solution. 

1 Introduction 

The last few years have seen an increasing interest of theoretical computer scientists,
mathematicians and, more recently, statistical physicists in random combinatorial optimization
and decision problems, see e.g. [1, 2]. Traditional complexity theory [3] characterizes
combinatorial problems by the worst-case dependence of the solution time of algorithms
on the problem size or, more precisely, on the memory size needed to encode a problem.
Some of the most challenging problems are collected in the class of NP-complete problems:
In such problems a potential solution can be verified (or falsified) efficiently, i.e. in
polynomial time, whereas the search for a solution among the exponential number of candidates
becomes very slow for entropic reasons. The completeness property refers to the fact
that once an efficient, i.e. polynomial, algorithm is found for any NP-complete problem, it
can be modified to solve every other such problem efficiently. Whether or not such
algorithms can be constructed is, however, still open, and is one of the important open
questions of modern mathematics. Famous members of the class of NP-complete problems
are e.g. Boolean satisfiability, number partitioning, vertex cover, and the traveling-salesman
problem.

This worst-case classification does, however, give no information on typical solution times.
For almost ten years now, randomized optimization and decision problems have therefore been
studied; for an overview see the special issues [4], [5]. It was realized that the longest,
exponentially growing solution times typically appear when the problems are situated at phase
boundaries and are therefore critically constrained [6].
Due to the analogy between such combinatorial optimization problems and statistical-mechanics
models with discrete degrees of freedom at low temperature, many methods developed
in physics can be applied directly to theoretical computer science. This was done
e.g. for Boolean satisfiability d, ^, 0, |L for number partitioning ||, the traveling-salesman 
problem [10], for Euclidean matching [11] and recently also for vertex cover [12]. The
relation between phase transitions and the appearance of the hardest instances was also recently
analyzed for specific algorithms using statistical-mechanics methods [13, 14].

In this paper, we give a detailed description of the statistical-mechanics approach to
minimal vertex covers on finite-connectivity random graphs. For this purpose, the model will
be mapped to a random lattice gas of hard spheres of radius one.

The plan of the paper is the following. In the next section we define the model and
give an overview of some rigorously known results. In section 3 the model is shown to
be equivalent to a hard-sphere lattice gas. Section 4 explains the numerical methods used
to check the analytical results. The latter are based on the replica approach presented in
sections 5-7, starting with a general calculation of the replicated free energy. In section 6,
the most important results are presented: the size, entropy and structure of minimal vertex
covers are described in a replica-symmetric approach, whereas the simplest one-step replica
symmetry broken solution is explained in section 7. The paper closes with a concluding
section. Several technical details are relegated to three appendices.



2 The model 

In this section, we will introduce the terminology and some rigorously known results about 
vertex cover and related problems. 



2.1 Vertex cover and related problems 

Let us start with the definition of vertex covers. We consider a graph G = (V, E) with N
vertices i ∈ {1, 2, ..., N} and undirected edges {i, j} ∈ E ⊂ V × V connecting pairs of vertices.
Please note that {i, j} and {j, i} both denote the same edge.

Definition 1: A vertex cover V_vc is a subset V_vc ⊂ V of vertices such that for all edges
{i, j} ∈ E at least one of the endpoints is in V_vc, i.e. i ∈ V_vc or j ∈ V_vc.

Later on we also consider subsets V_vc which are not covers. In either case, we call all vertices
in V_vc covered and all others uncovered. Edges from E ∩ (V_vc × V ∪ V × V_vc) are also called covered.
This means that if V_vc is a vertex cover, all edges are covered.

We will study the minimal vertex-cover problem, which consists in finding a vertex cover
V_vc of minimal cardinality, and calculate the minimal fraction x_c(G) = |V_vc|/N needed to
cover the whole graph.

This problem is equivalent to other optimization problems: 

• An independent set is a subset of vertices which are pairwise disconnected in the graph
G. By Definition 1, no edge can have both endpoints outside a vertex cover, so any set
V \ V_vc forms an independent set, and maximal independent sets are complementary to
minimal vertex covers.

• A clique is a fully connected subset of vertices, and thus an independent set in the
complementary graph \bar{G}, where vertices i and j are connected whenever {i, j} ∉ E and
vice versa.
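The three equivalent formulations can be checked directly on a small example. The following is a minimal sketch; the helper names and the four-vertex path graph are illustrative assumptions, not part of the original work.

```python
from itertools import combinations

def is_vertex_cover(edges, cover):
    """True if every edge has at least one endpoint in `cover`."""
    return all(i in cover or j in cover for i, j in edges)

def is_independent_set(edges, subset):
    """True if no edge joins two vertices of `subset`."""
    return not any(i in subset and j in subset for i, j in edges)

def complement_edges(vertices, edges):
    """Edge list of the complementary graph."""
    present = {frozenset(e) for e in edges}
    return [(i, j) for i, j in combinations(sorted(vertices), 2)
            if frozenset((i, j)) not in present]

def is_clique(edges, subset):
    """True if all pairs of vertices in `subset` are joined by an edge."""
    present = {frozenset(e) for e in edges}
    return all(frozenset(p) in present for p in combinations(subset, 2))

# Hypothetical example: the path 0-1-2-3.
vertices = {0, 1, 2, 3}
edges = [(0, 1), (1, 2), (2, 3)]
cover = {1, 2}                    # a minimal vertex cover of the path
independent = vertices - cover    # its complement {0, 3}

print(is_vertex_cover(edges, cover))
print(is_independent_set(edges, independent))
print(is_clique(complement_edges(vertices, edges), independent))
```

The complement of the cover is an independent set of the path and a clique of its complementary graph, as stated in the two items above.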
2.2 Random graphs



In order to speak of median or average cases, and of phase transitions, we have to introduce
a probability distribution over graphs. This is best done using the concept of random
graphs, introduced about 40 years ago by Erdős and Rényi [15]. A random graph
G_{N,p} is a graph with N vertices V = {1, ..., N} in which any pair of vertices is connected
randomly and independently by an edge with probability p. The expected number of edges thus
becomes p \binom{N}{2} = pN^2/2 + O(N), and the average connectivity of a vertex equals p(N - 1).

The regime we are interested in is, however, that of finite-connectivity graphs having p = c/N
with constant c in the large-N limit. Then the average connectivity c + O(N^{-1}) stays finite. In
this case, we also expect the size of minimal vertex covers to depend only on c, x_c(G) = x_c(c)
for almost all random graphs G_{N,c/N}.

Here we briefly review some of the fundamental results on random graphs which
were already described in [15], and which are important for the following sections.

The first point we want to mention is the distribution of connectivities (or vertex degrees)
d. In the limit N → ∞ it is given by a Poisson distribution with mean c:

    \mathrm{Po}_c(d) = e^{-c} \frac{c^d}{d!} .    (1)
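The Poisson law can be compared with a sampled graph. The sketch below draws one graph from G_{N,c/N} and prints its empirical degree histogram next to Po_c(d); the size N = 1000, the seed and the function names are arbitrary illustrative choices.

```python
import math
import random
from collections import Counter

def random_graph(n, c, rng):
    """Sample G(N, p = c/N): each of the N(N-1)/2 pairs is an edge with probability p."""
    p = c / n
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

def degree_histogram(n, edges):
    """Fraction of vertices with degree d, as a dict {d: fraction}."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    counts = Counter(deg)
    return {d: counts[d] / n for d in sorted(counts)}

def poisson(c, d):
    """Poisson weight e^{-c} c^d / d! of eq. (1)."""
    return math.exp(-c) * c**d / math.factorial(d)

rng = random.Random(0)            # fixed seed, an arbitrary choice
n, c = 1000, 2.0
edges = random_graph(n, c, rng)
hist = degree_histogram(n, edges)
mean_degree = 2 * len(edges) / n

for d in range(6):
    print(d, round(hist.get(d, 0.0), 3), round(poisson(c, d), 3))
```

For a single finite sample the two columns agree only up to statistical fluctuations of order N^{-1/2}.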

A second point which is important for the understanding of the following is the component
structure. For c < 1, i.e. if the vertices have on average less than one neighbor, the graph
G_{N,c/N} is built up from connected components containing up to O(ln N) vertices. The
probability that a component is a specific tree T_k of k vertices is given by

    p(k) = e^{-ck} \frac{c^{k-1}}{k!} ,    (2)

and is equal for all k^{k-2} distinct trees. As the fraction of vertices which are collected in
finite trees is \sum_{k=1}^{\infty} p(k) k^{k-2} k = 1 for all c < 1, in this case almost all
vertices are collected in such trees. For c > 1 a giant component appears which contains a
finite fraction of all vertices; c = 1 is therefore called the percolation threshold.
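The normalization of the tree sum can be verified numerically. The following sketch evaluates the fraction of vertices in finite trees using p(k) from eq. (2); the terms are computed in logarithmic form to avoid overflow, and the cutoff kmax is an assumption chosen large enough for the values of c used here.

```python
import math

def tree_vertex_fraction(c, kmax=2000):
    """Sum over k of k * k^(k-2) * p(k), with p(k) = exp(-c k) c^(k-1) / k!."""
    total = 0.0
    for k in range(1, kmax + 1):
        log_term = ((k - 1) * math.log(k) - c * k
                    + (k - 1) * math.log(c) - math.lgamma(k + 1))
        total += math.exp(log_term)
    return total

below = tree_vertex_fraction(0.5)   # below the percolation threshold
above = tree_vertex_fraction(2.0)   # above the percolation threshold
print(below, above)
```

Below the threshold the sum equals one within numerical precision, while above it the sum drops below one; the missing weight is the fraction of vertices in the giant component.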

For a complete introduction to random graphs see the book by Bollobás [16].



2.3 Rigorously known bounds 

In this subsection we present some previously known rigorous bounds on x_c(c). A
general one for arbitrary, i.e. non-random, graphs G was given by Harant [17], who generalized
an old result of Caro and Wei [18]. Translated into our notation, he showed that

    x_c(G) \le 1 - \frac{1}{N} \frac{ \left( \sum_{i \in V} \frac{1}{d_i + 1} \right)^2 }{ \sum_{i \in V} \frac{1}{d_i + 1} - \sum_{(i,j) \in E} \frac{(d_i - d_j)^2}{(d_i + 1)(d_j + 1)} }    (3)

where d_i is the connectivity (or degree) of vertex i. Using the distribution (1) of connectivities
and its generalization to pairs of connected vertices, this can easily be converted into an upper
bound on x_c(c) which holds almost surely for N → ∞.

The vertex-cover problem and the above-mentioned related problems were also studied in
the case of random graphs, and even completely solved in the case of infinite-connectivity
graphs, where any edge is drawn with finite probability p, such that the expected number of
edges is p \binom{N}{2} = O(N^2). There the minimal VC has cardinality
(N - 2 \ln_{1/(1-p)} N - O(\ln \ln N)) [19]. Bounds in the finite-connectivity region of random
graphs with N vertices and cN edges were given by Gazmuri [20]. He showed that

    x_l(c) < x_c(c) < 1 - \frac{\ln c}{c}    (4)

where the lower bound is given by the unique solution of

    0 = x_l(c) \ln x_l(c) + (1 - x_l(c)) \ln(1 - x_l(c)) - \frac{c}{2} (1 - x_l(c))^2 .    (5)

This bound coincides with the so-called annealed bound in statistical physics. The correct
asymptotics for large c was given by Frieze [21]:

    x_c(c) = 1 - \frac{2}{c} \left( \ln c - \ln \ln c + 1 - \ln 2 \right) + o\!\left( \frac{1}{c} \right)    (6)

with corrections of o(1/c) decaying faster than 1/c.

3 Equivalence to a hard-sphere lattice gas 

After having introduced the problem in mathematical terms, we are now going to connect it 
to a statistical-mechanics model, more precisely to a lattice gas of hard spheres of chemical 
radius one. 

Any subset U ⊂ V of the vertex set can be encoded by a configuration of N binary
variables:

    x_i := \begin{cases} 0 & \text{if } i \in U \\ 1 & \text{if } i \notin U \end{cases}    (7)

The seemingly strange choice of setting x_i to zero for vertices in U becomes clear if we look
at the vertex-cover constraint: An edge is covered by the elements in U iff at most one of the
two end-points has x = 1. So the variables x_i can be interpreted as occupation numbers of vertices
by the center of a particle. The covering constraint translates into a hard-sphere constraint:
If a vertex is occupied, i.e. x_i = 1, then all neighboring vertices have to be empty. We thus
introduce an indicator function

    \chi(x_1, \ldots, x_N) = \prod_{\{i,j\} \in E} (1 - x_i x_j)    (8)

which is one whenever x = (x_1, ..., x_N) corresponds to a vertex cover, and zero otherwise. With
this interpretation in mind, we may write down the grand partition function

    \Xi = \sum_{\{x_i = 0,1\}} \exp\left( \mu \sum_i x_i \right) \chi(\vec{x})    (9)

with μ being a chemical potential which can be used to control the particle number, or the
cardinality of U.

For regular lattices, this model is well studied as a lattice model for the fluid-solid transition;
for an overview and the famous corner-transfer-matrix solution of the two-dimensional
hard-hexagon model by R. Baxter see his book [22]. Recently, lattice-gas models with various
kinds of disorder have been considered in connection with glasses [23], [24], [25] and granular
matter [26], [27], [28], [29], [30], [31].
Denoting the grand canonical average as

    \langle f(\vec{x}) \rangle_\mu = \Xi^{-1} \sum_{\{x_i = 0,1\}} \exp\left( \mu \sum_i x_i \right) \chi(\vec{x}) f(\vec{x})    (10)

we can calculate the average occupation density

    \nu(\mu) = \frac{1}{N} \left\langle \sum_i x_i \right\rangle_\mu = \frac{\partial}{\partial \mu} \frac{\ln \Xi}{N} ,    (11)

and the corresponding entropy density is given by a Legendre transform of ln Ξ,

    s(\nu(\mu)) = \left( 1 - \mu \frac{\partial}{\partial \mu} \right) \frac{\ln \Xi}{N} ,    (12)

where the thermodynamic limit N → ∞ is implicitly assumed. The entropy of vertex covers
of cardinality xN thus reads

    s_{vc}(x) = s(1 - x) .    (13)

Minimal vertex covers correspond to densest particle packings. Considering the weights
in (9), it becomes obvious that the density ν(μ) is an increasing function of the chemical
potential. Densest packings, or minimal vertex covers, are thus obtained in the limit μ → ∞:

    x_c(c) = 1 - \lim_{\mu \to \infty} \nu(\mu) .    (14)
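For very small graphs the grand partition function can be summed explicitly, which makes the limit μ → ∞ concrete. The sketch below enumerates all 2^N configurations; the five-vertex cycle used as a test graph and the function name are illustrative assumptions.

```python
import math
from itertools import product

def grand_partition(n, edges, mu):
    """Return (Xi, nu): partition function (9) and density (11) by full enumeration."""
    xi = 0.0
    weighted_particles = 0.0
    for x in product((0, 1), repeat=n):
        if any(x[i] and x[j] for i, j in edges):   # chi(x) = 0: two neighbors occupied
            continue
        weight = math.exp(mu * sum(x))
        xi += weight
        weighted_particles += weight * sum(x)
    return xi, weighted_particles / (xi * n)

# Hypothetical test graph: a cycle of 5 vertices (its minimal vertex cover has size 3).
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]
for mu in (0.0, 2.0, 5.0, 10.0):
    xi, nu = grand_partition(n, edges, mu)
    print(f"mu={mu:5.1f}  nu={nu:.4f}  1-nu={1 - nu:.4f}")
```

As μ grows, 1 - ν(μ) approaches the minimal cover fraction 3/5 of the cycle from above.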

4 Numerical methods 

Before explicitly following this strategy in the special case of random graphs, we
present our numerical methods, so that we can later directly compare all analytical results
to numerical data.

All numerical results were obtained by exact enumeration. For large average connectivities
c > 4 a branch-and-bound algorithm was applied, while for small average connectivities
a divide-and-conquer technique is more appropriate. Since some readers may not be familiar
with combinatorial optimization algorithms, the methods are explained in detail. Before
presenting the two procedures, we first introduce a fast heuristic, which is used within both
methods. The heuristic can also be applied stand-alone. In this case only an approximation
of the true minimum vertex cover is calculated, which is found to differ only by a few percent
from the exact value. All methods have been implemented with the help of the LEDA
library [32], which offers many useful data types and algorithms for linear algebra and graph
problems.

The basic idea of the heuristic is to cover as many edges as possible using as few
vertices as necessary. Thus, it seems favorable to cover vertices with a high degree. This
step can be iterated, while the degrees of the vertices are adjusted dynamically by removing
edges and vertices which are covered. This leads to the following algorithm, which returns
an approximation V_vc of the minimum vertex cover; its size |V_vc| is an upper bound on the
true minimum vertex-cover size:

algorithm min-cover(G)
begin
    initialize V_vc = ∅;
    while there are uncovered edges do
    begin
        take one vertex i with the largest current degree d_i;
        mark i as covered: V_vc = V_vc ∪ {i};
        remove all incident edges {i, j} from E;
        remove vertex i from V;
    end;
    return(V_vc);
end

In Fig. 1 a simple example is presented, where the heuristic fails to find the true minimal
vertex cover. First the algorithm covers the root vertex of degree 3. Thus, 3 additional
vertices have to be covered subsequently, i.e. the heuristic covers 4 vertices. But the
minimum vertex cover has only size 3, as indicated in Fig. 1.
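The heuristic and its failure can be reproduced in a few lines. This is a minimal sketch, not the LEDA implementation used for the paper; the example graph is an assumption inferred from the description (a root with three children, each child carrying one leaf), since the figure itself is not reproduced here.

```python
def greedy_cover(edges):
    """Greedy heuristic: repeatedly cover a vertex of largest current degree."""
    remaining = {frozenset(e) for e in edges}
    cover = set()
    while remaining:
        degree = {}
        for e in remaining:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        # ties are broken in favor of the smallest label (an arbitrary choice)
        best = max(sorted(degree), key=degree.get)
        cover.add(best)
        remaining = {e for e in remaining if best not in e}
    return cover

# Assumed example: root 0, children 1, 2, 3, and one leaf (4, 5, 6) per child.
edges = [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 6)]
greedy = greedy_cover(edges)
optimal = {1, 2, 3}               # covering the three children covers all edges
print(sorted(greedy), len(greedy), len(optimal))
```

The heuristic first takes the root and then needs one vertex for each of the three remaining edges, whereas covering the three children alone suffices.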

So far we have presented a simple heuristic to find approximations of minimum vertex
covers. It will be part of the exact algorithms which we applied to obtain all
numerical results presented in this work. Next, the two exact algorithms are explained:
divide-and-conquer and branch-and-bound.

The basic idea of both methods is as follows: as each vertex is either covered or uncovered,
there are 2^N possible configurations which can be arranged as leaves of a binary (backtracking)
tree. At each node, the two subtrees represent the subproblems where the corresponding
vertex is either covered ("left subtree") or uncovered ("right subtree"). Vertices which have
not been touched at a certain level of the tree are said to be free. Neither algorithm
descends further into the tree once a cover has been found, i.e. when all edges are covered.
Then the search continues in higher levels of the tree (backtracking) for a cover which
possibly has a smaller size. Since the number of nodes in a tree grows exponentially with system
size, algorithms which are based on backtracking trees have a running time which may grow
exponentially with the system size. This is not surprising, since the minimal-VC problem is
NP-hard, so all known exact methods exhibit an exponentially growing worst-case time complexity.

To decrease the running time, both algorithms make use of the fact that only full vertex
covers are to be obtained. Therefore, when a vertex i is marked uncovered, all neighboring
vertices can be covered immediately. For these vertices, only the left subtrees are
present in the search tree.

The divide-and-conquer [33] approach is based on the fact that a minimum VC of a
graph which consists of several independent connected components can be obtained by
combining the minimum covers of the components. Thus, the full task can be split into
several independent tasks. This strategy can be repeated at all levels of the backtracking
tree. At each level, the edges which have been covered can be removed from the graph, so the
graph may split into further components. As a consequence, below the percolation threshold,
where the size of the largest components is of the order O(ln N), the algorithm exhibits a
polynomial running time. The divide-and-conquer approach is summarized below. The given
subroutine is called for each component of the graph separately and returns the size
of the minimum vertex cover. Initially all vertices have state free:

algorithm divide_and_conquer(G)
begin
    take one free vertex i with the largest current degree d_i;

    mark i as covered;    comment left subtree
    size1 := 1;
    remove all incident edges {i, j} from E;
    calculate all connected components {C_j} of the graph built by the free vertices;
    for all components C_j do
        size1 := size1 + divide_and_conquer(C_j);
    insert all edges {i, j} which have been removed;

    mark i as uncovered;    comment right subtree
    size2 := number of current neighbors of i;    comment all of them get covered
    for all neighbors j of i do
    begin
        mark j as covered;
        remove all incident edges {j, k} from E;
    end;
    calculate all connected components {C_j} of the graph built by the free vertices;
    for all components C_j do
        size2 := size2 + divide_and_conquer(C_j);
    for all neighbors j of i do
        mark j as free;
    insert all edges {j, k} which have been removed;

    mark i as free;
    if size1 < size2 then
        return(size1);
    else
        return(size2);
end

The algorithm can easily be extended to record the cover sets as well, or to calculate the
degeneracy. In Fig. 2 an example of the operation is given. The algorithm is able to treat
large graphs deep in the percolating regime. We have calculated, for example, minimum vertex
covers for graphs of size N = 560 with average connectivity c = 1.3.
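The recursion described above can be sketched compactly in Python. This is an illustrative re-implementation under simplifying assumptions (vertex subsets are passed as sets instead of marking and restoring edges), not the LEDA code used for the results; all function names are hypothetical.

```python
def components(vertices, adj):
    """Connected components of the subgraph induced by the set `vertices`."""
    seen, comps = set(), []
    for s in vertices:
        if s in seen:
            continue
        comp, stack = set(), [s]
        seen.add(s)
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in adj[v]:
                if w in vertices and w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def min_cover_size(vertices, adj):
    """Exact minimum vertex-cover size of the subgraph induced by the free `vertices`."""
    deg = {v: sum(1 for w in adj[v] if w in vertices) for v in vertices}
    i = max(deg, key=deg.get, default=None)
    if i is None or deg[i] == 0:
        return 0                                   # no edges left to cover
    # left subtree: vertex i is covered
    rest = vertices - {i}
    size1 = 1 + sum(min_cover_size(c, adj) for c in components(rest, adj))
    # right subtree: i stays uncovered, so all its current neighbors are covered
    nbrs = {w for w in adj[i] if w in vertices}
    rest = vertices - {i} - nbrs
    size2 = len(nbrs) + sum(min_cover_size(c, adj) for c in components(rest, adj))
    return min(size1, size2)

def solve(n, edges):
    """Minimum vertex-cover size of a graph with vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return sum(min_cover_size(c, adj) for c in components(set(range(n)), adj))

print(solve(5, [(i, (i + 1) % 5) for i in range(5)]))
```

Passing subsets instead of undoing edge removals trades some efficiency for brevity; the splitting into components after each decision is the same as in the pseudocode.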

For average connectivities larger than 4, the divide-and-conquer algorithm is too slow,
because the graph only rarely splits into several components. Then a branch-and-bound
approach [35], [36] is favorable. It differs from the previous method by the fact that no
independent components of the graph are calculated. Instead, some subtrees of the backtracking
tree are omitted by introducing a bound: This is achieved by always storing the
size best of the smallest vertex cover found so far (initially N) and recording the number X
of vertices which have been covered in higher levels of the tree. Additionally, a table
of free vertices ordered by descending current degree d_i is always kept. Thus, to achieve a better
solution, at most F = best - X vertices can be covered. This means that it is not possible to
cover more edges than given by the sum D = \sum_{l=1}^{F} d_l of the F highest degrees in the
table of vertices, i.e. if some edges would remain uncovered, the corresponding subtree can
safely be omitted. Please note that if some edges run between the F vertices of
highest current degree, a subtree may be entered even if it contains no smaller cover.

The algorithm is summarized below. The size of the smallest cover found so far is
stored in best, which is passed by reference (i.e. the variable, not its value, is passed). The
number of covered vertices is stored in the variable X; please remember G = (V, E):

algorithm branch_and_bound(G, best, X)
begin
    if all edges are covered then
    begin
        if X < best then
            best := X;
        return;
    end;

    calculate F = best - X; D = \sum_{l=1}^{F} d_l;
    if D < number of uncovered edges then
        return;    comment bound

    take one free vertex i with the largest current degree d_i;

    mark i as covered;    comment left subtree
    X := X + 1;
    remove all incident edges {i, j} from E;
    branch_and_bound(G, best, X);
    insert all edges {i, j} which have been removed;
    X := X - 1;

    if (F > number of current neighbors) then
    begin    comment right subtree
        mark i as uncovered;
        for all neighbors j of i do
        begin
            mark j as covered;
            X := X + 1;
            remove all incident edges {j, k} from E;
        end;
        branch_and_bound(G, best, X);
        for all neighbors j of i do
        begin
            mark j as free;
            X := X - 1;
        end;
        insert all edges {j, k} which have been removed;
    end;

    mark i as free;
    return;
end
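A compact Python sketch of the same bounding idea follows. It is an illustration under simplifying assumptions: the set of free vertices is passed explicitly, degrees are recomputed at every node rather than maintained in a table, and the right subtree is entered only if covering all current neighbors can still improve on the best cover. The function names are hypothetical.

```python
def branch_and_bound(n, edges):
    """Exact minimum vertex-cover size by backtracking with the degree-sum bound."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    best = [n]                                   # size of the smallest cover found so far

    def search(free, covered_count):
        # uncovered edges are exactly the edges between two free vertices
        deg = {v: len(adj[v] & free) for v in free}
        uncovered = sum(deg.values()) // 2
        if uncovered == 0:
            best[0] = min(best[0], covered_count)
            return
        f = best[0] - covered_count              # F = best - X
        top = sorted(deg.values(), reverse=True)[:max(f, 0)]
        if sum(top) < uncovered:
            return                               # bound: no better cover in this subtree
        i = max(deg, key=deg.get)
        search(free - {i}, covered_count + 1)    # left subtree: i covered
        nbrs = adj[i] & free
        if best[0] - covered_count > len(nbrs):  # right subtree: all neighbors covered
            search(free - {i} - nbrs, covered_count + len(nbrs))

    search(frozenset(range(n)), 0)
    return best[0]

print(branch_and_bound(5, [(i, (i + 1) % 5) for i in range(5)]))
```

Recomputing the degrees costs time proportional to the number of free vertices per node; the two-array table described in the text avoids this overhead.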

For every calculation of the bound one has to access the F vertices of largest current
connectivity. Thus it is favorable to implement the table of vertices as two arrays v_1, v_2
of sets of vertices. The arrays are indexed by the current degree of the vertices. The sets
of the first array v_1 contain the F free vertices of largest current degree, while the other
array contains all other free vertices. Every time a vertex changes its degree, it is moved
to another set, and possibly even changes the array. Also, when the mark of a vertex
changes, it may be entered into or removed from an array, and possibly the smallest-degree vertex
of v_1 is moved to v_2 or vice versa. Since we are treating graphs of finite average connectivity,
this implementation ensures that the running time spent in each node of the backtracking tree
is almost constant. For the sake of clarity, we have omitted the update operations for both
arrays from the algorithmic presentation.

Although our algorithm is very simple, in the regime 4 < c < 10 random graphs up to
size N = 140 could be treated. It is difficult to compare the branch-and-bound algorithm
to state-of-the-art algorithms [36], [37] because they are usually tested on a different graph
ensemble where each edge appears with a certain probability, independently of the graph
size (high-connectivity regime). Nevertheless, in the literature usually graphs with up to
200 vertices are treated, which is slightly larger than the systems considered here. But our
algorithm has the advantage that it is easy to implement. Furthermore, it can be easily
modified to study more general questions, see [12].



5 Replica approach 



After having introduced our numerical methods, we go back to the statistical-mechanics
approach described in section 3. The main problem in handling the grand partition function
(9) is caused by the disorder due to the random structure of the underlying graph, i.e. of
the edge set E. To calculate typical properties we therefore have to evaluate the disorder
average of ln Ξ over the random-graph ensemble. This can be achieved by the replica trick

    \overline{\ln \Xi} = \lim_{n \to 0} \frac{\overline{\Xi^n} - 1}{n}    (15)

where the over-bar denotes the disorder average over the random-graph ensemble. Taking n
to be a positive integer at the beginning, we may replace the original system by n identical
copies (including identical edge sets). In this case, the disorder average is easily obtained,
and the n → 0 limit has to be taken later by analytic continuation in n. We may thus
write, with n being a natural number,



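    \overline{\Xi^n} = \overline{ \sum_{\{x_i^a\}} \exp\left( \mu \sum_{i,a} x_i^a \right) \prod_{\{i,j\} \in E} \prod_{a=1}^{n} (1 - x_i^a x_j^a) }    (16)

with a denoting the replica index which runs from 1 to n. Putting edges independently with
probability c/N results in

    \overline{\Xi^n} = \sum_{\{x_i^a\}} \exp\left( \mu \sum_{i,a} x_i^a \right) \prod_{i<j} \left[ 1 - \frac{c}{N} + \frac{c}{N} \prod_a (1 - x_i^a x_j^a) \right]
                     = \sum_{\{x_i^a\}} \exp\left( \mu \sum_{i,a} x_i^a - \frac{cN}{2} + \frac{c}{2N} \sum_{i,j} \prod_a (1 - x_i^a x_j^a) + O(1) \right) .    (17)

Following the ideas of |3j| , we introduce 2^n order parameters

    c(\vec{\xi}) = \frac{1}{N} \sum_i \prod_a \delta_{\xi^a, x_i^a}    (18)

as the fraction of vertices showing the replicated variable \vec{\xi} ∈ {0, 1}^n. The exponent in the
last line of (17) obviously depends only on this quantity. Using Stirling's formula for the
number N! / \prod_{\vec{\xi}} (c(\vec{\xi}) N)! of configurations {x_i^a} having the same c(\vec{\xi}), we find

    \overline{\Xi^n} = \int \prod_{\vec{\xi}} dc(\vec{\xi}) \exp\left\{ N \left[ - \sum_{\vec{\xi}} c(\vec{\xi}) \ln c(\vec{\xi}) - \frac{c}{2} + \mu \sum_{\vec{\xi}} c(\vec{\xi}) \sum_a \xi^a + \frac{c}{2} \sum_{\vec{\xi}, \vec{\zeta}} c(\vec{\xi}) c(\vec{\zeta}) \prod_a (1 - \xi^a \zeta^a) \right] \right\}    (19)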
where the integration is over all normalized distributions c(\vec{\xi}), i.e. \sum_{\vec{\xi}} c(\vec{\xi}) = 1. In the
large-N limit, the integration can be solved by the saddle-point method. The saddle-point
equation is obtained by variation of the exponent in (19) with respect to all allowed c(\vec{\xi}):

    c(\vec{\xi}) = \exp\left\{ -\lambda + \mu \sum_a \xi^a + c \sum_{\vec{\zeta}} c(\vec{\zeta}) \prod_a (1 - \xi^a \zeta^a) \right\} .    (20)

λ is a Lagrange multiplier introduced in order to guarantee the normalization of c(\vec{\xi}). For
n → 0, it will tend to the connectivity c. Before we can calculate this limit, however, we have
to introduce some ansatz for c(\vec{\xi}), as even the dimensionality of c(\vec{\xi}) still depends on n. In
the next section, we are going to use the simplest possible one, i.e. the replica-symmetric ansatz.
As this ansatz is found to be valid only for a finite connectivity range, we also include one
step of replica-symmetry breaking in the section after that.
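The identity behind the replica trick (15) used at the start of this section, ln Z = lim_{n→0} (Z^n - 1)/n applied under the average, can be illustrated numerically. In the sketch below the disorder is replaced by an arbitrary positive random variable Z; this toy distribution and the seed are assumptions for illustration only and have no relation to the vertex-cover partition function.

```python
import math
import random

rng = random.Random(1)                       # arbitrary fixed seed
# Toy stand-in for the disordered partition function: a positive random variable Z.
samples = [math.exp(rng.gauss(0.0, 1.0)) + 1.0 for _ in range(20000)]

# Direct ("quenched") average of ln Z over the samples.
mean_log = sum(math.log(z) for z in samples) / len(samples)

def replica_estimate(n):
    """(mean(Z^n) - 1) / n, evaluated on the same samples for real n > 0."""
    return (sum(z**n for z in samples) / len(samples) - 1.0) / n

for n in (1.0, 0.5, 0.1, 0.01, 0.001):
    print(n, replica_estimate(n), mean_log)
```

The estimate approaches the average of ln Z from above as n decreases. In the actual calculation n cannot simply be taken small numerically: the average is first computed for integer n and then continued analytically, which is why an ansatz for the order parameter is required.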



6 The replica-symmetric solution 
6.1 The replica-symmetric ansatz 

As already explained, we now use the so-called replica-symmetric ansatz, which in our
case assumes that the order parameter c(\vec{\xi}) depends on \vec{\xi} only via \sum_a \xi^a, cf. also [39].
In this case we are able to write

    c(\vec{\xi}) = \int dh\, P(h) \frac{\exp\left( h \sum_a \xi^a \right)}{(1 + e^h)^n}    (21)

with P(h) being a probability distribution to guarantee the normalization of c(\vec{\xi}). The
physical interpretation of P(h) is simple: Take any vertex i, then its average local occupation
number ⟨x_i⟩_μ in the presence of the chemical potential μ can be written as e^{h_i}/(1 + e^{h_i})
using an effective chemical potential h_i accounting for all interactions on i. P(h) can now be
constructed as the histogram of these effective chemical potentials.

Plugging this ansatz into equations (19) and (20), the replica number n appears as a mere
parameter, and the limit n → 0 can be calculated. Details of this calculation are given in
appendix A. The saddle-point equation (20) now reads

    \int dh\, P(h) e^{hy} = \exp\left\{ -c + \mu y + c \int dh\, P(h) (1 + e^h)^{-y} \right\}    (22)

and has to be valid for arbitrary y. According to equations (11)-(13) we find the entropy
density of vertex covers using a fraction x = 1 - \int dh\, P(h)/(1 + e^{-h}) of vertices



si « ., = / ^^ e ^P FT (k)[l - lnP FT (k)} HI + e h ) 



■ / dhi dh 2 P ( h^ P(h 2 ) In ( 1 - — 



(23) 



where

    P_{FT}(k) := \int dh\, e^{-ikh} P(h)    (24)

denotes the Fourier transform of P(h).
6.2 Size of minimal vertex covers

It is, however, too complicated to solve (22) directly for arbitrary μ. But we are interested
in the properties of minimal vertex covers, which, according to section 3, can be described
by the limit μ → ∞ of infinitely large chemical potential. In this case, we also expect the
effective chemical potentials h to scale as zμ, with z being a random variable with finite mean
and variance. The rescaled probability distribution is denoted by P(z). Please note that a
negative z now corresponds to vertices having always x_i = 0, whereas positive z indicates
vertices with fixed x_i = 1. All vertices being occupied in some ground states and empty
in others are collected in z = 0. This picture has to be refined for the calculation of the
vertex-cover entropy in section 6.4: There, contributions of O(μ^0) also have to be taken into
account. For the present purpose the dominant terms are, however, sufficient. To obtain a
well-defined limit μ → ∞ of eq. (22), we also have to rescale y by κ := μy. Thus eq. (22)
becomes

    \int dz\, P(z) e^{z\kappa} = \exp\left\{ -c + \kappa + c \left[ \int_{-\infty}^{+0} dz\, P(z) + \int_{+0}^{\infty} dz\, P(z) e^{-z\kappa} \right] \right\} .    (25)

The interpretation of this equation in terms of the cavity approach, see |p8| , becomes evident
if we Fourier-transform it and expand the last part of the exponential on the right-hand side,

    P(z) = \sum_{d=0}^{\infty} e^{-c} \frac{c^d}{d!} \left[ \delta(\cdot - 1) * P_-^{*d}(\cdot) \right](z) ,

    P_-(z) = \delta(z) \int_{-\infty}^{+0} dz'\, P(z') + \Theta(-z) P(-z) .    (26)

Here * denotes the convolution product. This equation describes the effective chemical potential
of a vertex, which is the linear superposition of the exterior chemical potential 1 · μ and the
contributions of all neighbors. The contribution P_-(z) of a neighbor depends on the effective
potentials the neighbors would have without the presence of the central vertex, for details 
on this cavity interpretation see |Q, and reflects the hard-sphere condition. A neighbor 
with positive potential would be occupied, x_i = 1, and thus forces a negative field for the
central vertex. Neighbors with non-positive chemical potential do not impose anything, as they
would have x_i = 0. At the end, the resulting distribution is averaged over the connectivity
distribution of random graphs.

This saddle-point equation has the simple solution

    P(z) = \sum_{l=-1}^{\infty} \frac{W(c)^{l+2}}{(l+1)!\, c}\, \delta(z + l)    (27)

where the Lambert W function W(c) is defined as the real solution of

    c = W(c)\, e^{W(c)} .    (28)

We already mentioned that vertices having negative fields are frozen to x_i = 0, vertices with
positive fields to x_i = 1. At the present level, we are, however, not able to calculate the average
occupation of the vertices belonging to z = 0. This will be done in the next section. Here
we only use the result: Half of the (z = 0)-vertices are occupied, half are empty. We therefore
find an average occupation density of hard spheres,

    \nu(\mu \to \infty) = \frac{1}{N} \left\langle \sum_i x_i \right\rangle_{\mu \to \infty} = \frac{2W(c) + W(c)^2}{2c} ,    (29)

which translates to a minimal vertex-cover size given by

    x_c(c) = 1 - \frac{2W(c) + W(c)^2}{2c} .    (30)

In figure [|, this result is compared to numerical simulations. Extremely good coincidence is 
found for small connectivities c. For non-percolating graphs, i.e. for c < 1, our result was 
recently proven to be exact Q . 

Systematic deviations show up at larger connectivities. For large c, eq. (30) even violates the
bounds given in section 2.3 and the exactly known asymptotics (6). As we will see later, this can
be explained within our approach: Replica symmetry breaks down at c = e ≈ 2.718, see
the following sections. Up to this value, however, we expect the replica-symmetric result to
be exact. This is astonishing, as the solution does not show any particular signature of the
percolation transition of the underlying random graph at c = 1.
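The replica-symmetric prediction x_c(c) = 1 - (2W(c) + W(c)^2)/(2c) is easy to evaluate. The sketch below computes W(c) by Newton iteration (a standard choice made here, not taken from the paper) and compares the result with the leading terms of the large-c asymptotics x_c ≈ 1 - (2/c)(ln c - ln ln c + 1 - ln 2); the function names are hypothetical.

```python
import math

def lambert_w(c, tol=1e-14):
    """Real solution W >= 0 of c = W exp(W) for c > 0, by Newton iteration."""
    w = math.log(1.0 + c)                    # starting guess to the right of the root
    for _ in range(100):
        step = (w * math.exp(w) - c) / (math.exp(w) * (1.0 + w))
        w -= step
        if abs(step) < tol:
            break
    return w

def x_c_rs(c):
    """Replica-symmetric minimal vertex-cover fraction."""
    w = lambert_w(c)
    return 1.0 - (2.0 * w + w * w) / (2.0 * c)

def x_c_asymptotic(c):
    """Leading large-c asymptotics without the o(1/c) correction (needs c > 1)."""
    return 1.0 - 2.0 / c * (math.log(c) - math.log(math.log(c)) + 1.0 - math.log(2.0))

for c in (0.5, 1.0, 2.0, math.e, 10.0, 100.0):
    print(f"c={c:8.3f}  W={lambert_w(c):.6f}  x_c(RS)={x_c_rs(c):.6f}")
print(x_c_rs(100.0), x_c_asymptotic(100.0))
```

At c = e one has W = 1, so the replica-symmetric value there is 1 - 3/(2e). At c = 100 the replica-symmetric expression lies below the asymptotic form, in line with the deviations described above.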

6.3 The backbone 

The distribution P(z) contains much more statistical information on minimal vertex covers
than simply their size. One important effect is a partial freezing: There
are vertices which are always uncovered (x_i = 1) in all minimal vertex covers, others are
always covered (x_i = 0). We call these vertices the uncovered (resp. covered) backbone. The
fractions of vertices belonging to these two backbone types are given by

    b_{uncov}(c) = \frac{W(c)}{c} ,

    b_{cov}(c) = 1 - \frac{W(c) + W(c)^2}{c} .    (31)

The remaining W(c)^2 N/c vertices are unfrozen, their covering state changes from ground
state to ground state. These different freezing properties can already be seen in simple finite
graphs: A graph consisting of two connected vertices has two minimal vertex covers, and
the state of the two vertices is not uniquely determined. They thus do not belong to the
backbone. The situation is different for graphs with three vertices and two edges. The central
vertex is covered in the unique minimal vertex cover, thus belonging to the covered backbone.
The other two vertices form the uncovered backbone.
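Both small examples can be checked by enumerating all minimal vertex covers. The following brute-force sketch is only meant for tiny graphs; the function name and vertex labels are illustrative assumptions.

```python
from itertools import combinations

def backbone(n, edges):
    """Enumerate all minimal vertex covers and classify the frozen vertices."""
    for size in range(n + 1):
        covers = [set(s) for s in combinations(range(n), size)
                  if all(i in s or j in s for i, j in edges)]
        if covers:
            break                              # smallest size admitting a cover
    always_covered = set.intersection(*covers)
    always_uncovered = set(range(n)) - set.union(*covers)
    return covers, always_covered, always_uncovered

# Two connected vertices: two minimal covers, no backbone.
print(backbone(2, [(0, 1)]))
# Three vertices, two edges: the central vertex 1 is always covered.
print(backbone(3, [(0, 1), (1, 2)]))
```

For the single edge both frozen sets are empty, while for the three-vertex chain the central vertex forms the covered backbone and the two end vertices the uncovered backbone.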

Let us now investigate the influence of the close environment of a vertex on its behavior,
more precisely the influence of its connectivity. The total connectivity distribution is given
by the Poisson law (1), but we can distinguish three distinct contributions:

• The joint probability P(d, ⟨x⟩ = 1) that a vertex has connectivity d and belongs to the
uncovered backbone of all minimal vertex covers.

• P(d, ⟨x⟩ = 0) gives the probability that a vertex has connectivity d and belongs to the
covered backbone of all minimal vertex covers.

• The remaining vertices are not in the backbone, thus being described by
P(d, 0 < ⟨x⟩ < 1).
These quantities can be easily computed from P(z): according to the interpretation of the 
self-consistent equation ( p6|) we can calculate the effective-field distribution for a vertex of 
connectivity d which, in average, has typical neighbors: 

P d (z)= [«*(■ + l)*P_* d (.)l (z) (32) 
where $P_-(z)$ is exactly the quantity given in (26). Plugging our solution (27) into this
equation, we find

$$
\begin{aligned}
P(d, \langle x \rangle = 1) &= P_d(z > 0)\, Po_c(d) = \frac{e^{-c}}{d!}\, [c - W(c)]^d \\
P(d, 0 < \langle x \rangle < 1) &= P_d(z = 0)\, Po_c(d) = \frac{e^{-c}}{d!}\, d\, W(c)\, [c - W(c)]^{d-1} \\
P(d, \langle x \rangle = 0) &= P_d(z < 0)\, Po_c(d) = Po_c(d) - \frac{e^{-c}}{d!}\, [c + (d-1) W(c)]\, [c - W(c)]^{d-1}
\end{aligned} \qquad (33)
$$

The results for c = 2 are displayed in Fig. 5 along with numerical data for N = 17, 35, 70.
Please note that the numerical results seem to converge towards the analytical one, thus
showing an excellent coincidence of both approaches. The curves are easily understood: A
vertex of connectivity 0 has no neighbors. Therefore, it does not appear in any optimum
cover and we obtain $P(0, \langle x \rangle = 1) = Po_c(0)$, $P(0, \langle x \rangle < 1) = 0$. With increasing connectivity
the probability that a vertex is covered increases, thus the contribution of $P(d, 0 < \langle x \rangle < 1)$
to $Po_c(d)$ increases as well. For large connectivities it is very probable that a vertex belongs
to all VCs, but even then a finite fraction of vertices with $\langle x \rangle = 1$ remains. These results justify
a posteriori the application of a greedy heuristic within the algorithm: Vertices having large
connectivity are included into the cover set first.
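As a consistency check of the three contributions, the following sketch evaluates them numerically. It assumes the forms $e^{-c}[c-W]^d/d!$, $e^{-c}\, d\, W [c-W]^{d-1}/d!$ and the remainder of the Poisson weight; the function names are hypothetical. Summed over d, the three columns should reproduce the backbone fractions $W/c$, $W^2/c$ and $1 - (W + W^2)/c$.

```python
import math


def lambert_w(c, tol=1e-14):
    """Principal branch W(c) for c >= 0, via Newton iteration on w*exp(w) = c."""
    w = math.log1p(c)
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - c) / (ew * (1.0 + w))
        w -= step
        if abs(step) < tol:
            break
    return w


def joint_degree_backbone(c, d):
    """Return (P(d,<x>=1), P(d,0<<x><1), P(d,<x>=0)) for connectivity d."""
    w = lambert_w(c)
    pref = math.exp(-c) / math.factorial(d)
    p_uncov = pref * (c - w) ** d
    p_free = pref * d * w * (c - w) ** (d - 1) if d > 0 else 0.0
    p_cov = pref * (c ** d - (c + (d - 1) * w) * (c - w) ** (d - 1))
    return p_uncov, p_free, p_cov


c = 2.0
for d in range(6):
    print(d, joint_degree_backbone(c, d))
```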



6.4 Entropy of minimal vertex covers 

In order to calculate the entropy of minimal vertex covers, we have to go beyond the leading
terms in the effective chemical potentials. If we consider e.g. the non-backbone vertices, the
order μ of the effective fields vanishes, but the order $\mu^0$ determines the actual average
occupation. We thus have to decompose the effective potentials according to $h = \mu z + \tilde z$, and
write the order parameter as

$$ c(\vec\xi) = \int dz\, d\tilde z\, P(z, \tilde z)\, \frac{\exp\{(\mu z + \tilde z) \sum_a \xi^a\}}{(1 + e^{\mu z + \tilde z})^n} , \qquad (34) $$

where both z and $\tilde z$ stay finite in the limit μ → ∞. In this sense, we have $\int d\tilde z\, P(z, \tilde z) = P(z)$,
the effective distribution calculated in (27). We thus write (for μ → ∞)

$$ P(z, \tilde z) = \sum_{l=-1}^{\infty} p_l\, \delta(z + l)\, \rho^{(l)}(\tilde z) , \qquad p_l = \frac{W(c)^{l+2}}{(l+1)!\, c} , \qquad (35) $$

where $\rho^{(l)}(\tilde z)$ describes the still unknown sub-dominant contributions to the effective potential
$-\mu l$. Plugging ansatz (35) into equation (19) for the grand partition function, we see that
the dominating part in ln Ξ is linear in μ, but vanishes finally once the saddle-point condition
is used. As is shown in appendix B, the term of $O(\mu^0)$ can be calculated and leads to the
entropy of minimal vertex covers,

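$$
\begin{aligned}
s_{VC}(x_c(c)) = {} & \int dz\, d\tilde z \int \frac{dk\, d\tilde k}{(2\pi)^2}\, e^{izk + i\tilde z \tilde k}\, P_{FT}(k, \tilde k)\, [1 - \ln P_{FT}(k, \tilde k)]\, \Phi(z, \tilde z) \\
& + \frac{c}{2}\, p_{-1}^2 \int d\tilde z_1\, d\tilde z_2\, \rho^{(-1)}(\tilde z_1)\, \rho^{(-1)}(\tilde z_2)\, \ln(e^{-\tilde z_1} + e^{-\tilde z_2}) \\
& + c\, p_0\, p_{-1} \int d\tilde z\, \rho^{(0)}(\tilde z)\, \ln\left(1 - \frac{1}{1 + e^{-\tilde z}}\right) \\
& + \frac{c}{2}\, p_0^2 \int d\tilde z_1\, d\tilde z_2\, \rho^{(0)}(\tilde z_1)\, \rho^{(0)}(\tilde z_2)\, \ln\left(1 - \frac{1}{(1 + e^{-\tilde z_1})(1 + e^{-\tilde z_2})}\right)
\end{aligned} \qquad (36)
$$

where $P_{FT}(k, \tilde k)$ signifies the two-dimensional Fourier transform of the distribution $P(z, \tilde z)$, and

$$ \Phi(z, \tilde z) = \begin{cases} 0 & \text{if } z < 0 \\ \ln(1 + e^{\tilde z}) & \text{if } z = 0 \\ \tilde z & \text{if } z > 0 \end{cases} \qquad (37) $$

is the $O(\mu^0)$-term in $\ln(1 + e^{\mu z + \tilde z})$. The corresponding saddle-point equations for the densities
$\rho^{(l)}(\tilde z)$, which are also calculated in appendix B, read

$$
\begin{aligned}
\rho^{(-1)}_{FT}(\tilde k) &= \exp\left\{ -c p_0 + c p_0 \int d\tilde z\, \rho^{(0)}(\tilde z)\, (1 + e^{\tilde z})^{i\tilde k} \right\} \\
\rho^{(l)}_{FT}(\tilde k) &= \rho^{(-1)}_{FT}(\tilde k)\, \left[\rho^{(-1)}_{FT}(-\tilde k)\right]^{l+1} .
\end{aligned} \qquad (38)
$$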

We now easily see that $\rho^{(0)}$ is an even distribution, so that on average exactly half of the
non-backbone vertices are covered in a minimal vertex cover. Within the region of validity of the
replica-symmetric ansatz, this result is also verified numerically, see figure [?]. Our argument for
deriving the expression for the minimal vertex-cover size is thus completed. Using these
saddle-point equations, we can eliminate all but one $\rho^{(l)}$ in the entropy (36). After a lengthy
calculation which is again delegated to the appendix, we finally get

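$$
\begin{aligned}
s_{VC}(x_c(c)) = \frac{p_0}{2} \Bigg\{ & \int \frac{d\tilde z\, d\tilde k}{2\pi}\, e^{i\tilde z \tilde k}\, \rho^{(0)}_{FT}(\tilde k)\, [1 - \ln \rho^{(0)}_{FT}(\tilde k)]\, \ln(1 + e^{\tilde z}) \\
& + c p_0 \int d\tilde z_1\, d\tilde z_2\, \rho^{(0)}(\tilde z_1)\, \rho^{(0)}(\tilde z_2)\, \ln\left(1 - \frac{1}{(1 + e^{-\tilde z_1})(1 + e^{-\tilde z_2})}\right) \Bigg\}
\end{aligned} \qquad (39)
$$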



which formally equals the expression (23) for the vertex-cover entropy for finite chemical
potential μ, with c replaced by $2 c p_0 = 2 W(c)^2$. The main difference is however the
self-consistent equation

$$ \rho^{(0)}_{FT}(\tilde k) = \exp\left\{ -2 c p_0 + c p_0 \int d\tilde z\, \rho^{(0)}(\tilde z) \left[ (1 + e^{\tilde z})^{i\tilde k} + (1 + e^{\tilde z})^{-i\tilde k} \right] \right\} \qquad (40) $$

which can be obtained from (39) by optimization with respect to all even distributions $\rho^{(0)}(\tilde z)$.
We are however not able to solve the last equation, and are therefore restricted to variational
approaches similar to [?]. A first upper estimate would be

$$ s_{VC}(x_c(c)) \simeq \frac{W(c)^2}{2c} \ln 2 + \frac{W(c)^4}{2c} \ln\frac{3}{4} \qquad (41) $$

resulting from $\rho^{(0)}_{var}(\tilde z) = \delta(\tilde z)$. This result can be slightly improved by using a Gaussian
variational ansatz for $\rho^{(0)}$, but the difference is only up to about 1%. For a comparison with
numerical results see figure [?].

From equation (40) we are also able to read off analytically some limitations of the
replica-symmetric solution (27). By developing (40) to second order in $\tilde k$, we find

$$ \Delta^2 := \int d\tilde z\, \rho^{(0)}(\tilde z)\, \tilde z^2 = 2 c p_0 \int_0^\infty d\tilde z\, \rho^{(0)}(\tilde z) \left\{ [\ln(1 + e^{-\tilde z})]^2 + [\ln(1 + e^{\tilde z})]^2 \right\} . \qquad (42) $$

Rescaling $\tilde z \to \Delta z$, we find

$$ 1 = \frac{2 c p_0}{\Delta^2} \int_0^\infty dz\, \hat\rho(z) \left\{ [\ln(1 + e^{-\Delta z})]^2 + [\ln(1 + e^{\Delta z})]^2 \right\} \qquad (43) $$

with $\hat\rho(z) = \Delta\, \rho^{(0)}(\Delta z)$ being of unit variance. For any $\hat\rho$, the right-hand side is a
monotonically decreasing function of Δ ranging from +∞ for Δ = 0 to $c p_0 = W(c)^2$ for
Δ → ∞. Identity (43) can thus be satisfied if and only if $W(c)^2 < 1$, which is valid for
c < e ≈ 2.718. We thus have to conclude that our replica-symmetric solution (27) becomes
inconsistent beyond average connectivity e, which is again in perfect agreement with the
systematic deviations of numerical data beyond this point, cf. figures [?] and [?]. Note however
that this point is far beyond the percolation threshold of the random graph. After the next
subsection we will come back to this point, and consider more involved replica-symmetric
and one-step broken saddle points.
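The consistency argument can be illustrated numerically. The sketch below is not the authors' calculation: it assumes the form of the right-hand side of (43) given above and evaluates it for one hypothetical unit-variance trial distribution, $\hat\rho(z) = [\delta(z-1) + \delta(z+1)]/2$, for which the half-line integral reduces to a single term. The function names are illustrative.

```python
import math


def lambert_w(c, tol=1e-14):
    """Principal branch W(c) for c >= 0, via Newton iteration on w*exp(w) = c."""
    w = math.log1p(c)
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - c) / (ew * (1.0 + w))
        w -= step
        if abs(step) < tol:
            break
    return w


def rhs_two_peak(delta, c):
    """Right-hand side of (43) for the trial rho_hat = [delta(z-1) + delta(z+1)] / 2."""
    cp0 = lambert_w(c) ** 2            # c * p_0 = W(c)^2
    lp = math.log1p(math.exp(delta))   # ln(1 + e^{Delta})
    lm = lp - delta                    # ln(1 + e^{-Delta})
    # The delta peak at z = +1 carries weight 1/2 on the half line z > 0.
    return cp0 * (lm * lm + lp * lp) / delta ** 2


for c in (2.0, math.e, 3.0):
    print(c, [round(rhs_two_peak(d, c), 4) for d in (0.5, 2.0, 10.0, 50.0)])
```

For c = 2 the values fall below one at large Δ, so the identity can be met; for c = 3 they stay above one for every Δ, in line with the threshold $W(c)^2 = 1$ at c = e.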

6.5 The structure of the non-backbone subgraph 

Before doing this, we will complete the discussion of the structure of minimal vertex covers
in the region 0 < c < e, where the described solution is expected to be exact. In this
subsection we will concentrate on the structure of the non-backbone subgraph, i.e. the
subgraph composed of all vertices which are not in the backbone, and all edges from E
connecting these vertices.

The first intuition on the structure of the non-backbone component can be drawn from
the saddle-point equation (26) for the distribution of effective chemical potentials. We
consider an arbitrary vertex, and call reduced graph the graph which is obtained from the original
graph by deleting the considered vertex and all its incident edges. According to the cavity
interpretation of (26), the vertex is not in the backbone iff exactly one of its neighbors would
be in the uncovered backbone of the reduced graph. The meaning of this becomes evident
if we consider the non-backbone graphs in figure [?]: Take e.g. the graph consisting of four
vertices and three edges. All its vertices belong to the non-backbone. Deleting a boundary
vertex, the reduced graph becomes backbone, in particular the neighbor of the boundary
vertex belongs to the uncovered backbone. Deleting instead a central vertex, the reduced
subgraph becomes disconnected into an isolated vertex, being uncovered backbone, and a
connected vertex pair, being non-backbone.

Iterating this argument, we conclude that the non-backbone can be partitioned into $p_0 N/2$
pairs of vertices, every pair containing an edge and possibly being connected to other pairs
or to covered backbone vertices. The supplementary edges connecting different pairs are
conjectured to be drawn randomly and independently with the original probability c/N
between any non-backbone vertices.

Even if we are not able to prove this conjecture, we may give strong arguments to support it:

• Looking at the non-backbone subgraphs of small tree-like graphs, the predicted
structure is found. A cluster expansion up to connected clusters of four vertex pairs provides
lower and upper bounds for the entropy which are in good agreement with numerical
findings (e.g. in the first four non-zero digits for c = 0.1).

• We can apply the statistical-mechanics formalism to a restricted random graph
ensemble having exactly the properties described above. This directly leads to expressions
(39) and (40) for the entropy and the effective-potential distribution.

• The proposed structure results in an even distribution of effective potentials for
connectivities c < e, whereas the average occupation density is expected to exceed 1/2 for
c > e. This is verified numerically, see figure [?].

• The average connectivity of a vertex pair to other vertex pairs in the restricted ensemble
is $2 c p_0 = 2 W(c)^2$, the percolation threshold would therefore be at $W(c) = 1/\sqrt{2}$, i.e. at
$c = \exp(1/\sqrt{2})/\sqrt{2} \approx 1.434$. We have checked this numerically by calculating the
fraction of non-backbone vertices in the largest connected component of the non-backbone
subgraph, see Fig. [?]. This quantity clearly extrapolates to 0 for connectivities below
the percolation point, and saturates at a finite value for larger connectivities. The
reason why this transition is shifted to higher connectivity compared to graph percolation
becomes obvious by considering the action of covered backbone vertices. They "cut"
the graph into smaller pieces. Please remember also that high-connectivity vertices are
more frequently found in the covered backbone, making this cutting mechanism more
effective.

• We should add the remark that we have performed a similar numerical study for the
backbone subgraph. We found that the percolation threshold of the backbone subgraph
is identical to the percolation threshold c = 1.0 of the whole graph.
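The value of the non-backbone percolation threshold quoted in the list above follows from inverting $W(c) e^{W(c)} = c$ at $W = 1/\sqrt{2}$. A minimal sketch of this arithmetic, with hypothetical helper names, reads:

```python
import math


def lambert_w(c, tol=1e-14):
    """Principal branch W(c) for c >= 0, via Newton iteration on w*exp(w) = c."""
    w = math.log1p(c)
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - c) / (ew * (1.0 + w))
        w -= step
        if abs(step) < tol:
            break
    return w


def pair_connectivity(c):
    """Average number of other vertex pairs a pair is linked to: 2 c p_0 = 2 W(c)^2."""
    return 2.0 * lambert_w(c) ** 2


# Percolation of the pair graph sets in where the pair connectivity reaches one.
w_star = 1.0 / math.sqrt(2.0)
c_star = w_star * math.exp(w_star)
print(c_star, pair_connectivity(c_star))
```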

The percolation however does not affect the validity of the replica-symmetric result,
which is valid even for percolated non-backbone subgraphs as long as c < e. The proposed
structure also allows for a very simple interpretation of approximation (41) of the entropy
of minimal VCs: An isolated pair contributes an entropy ln 2 as it has two possible minimal
VCs, thus explaining the first term in (41). This entropy is decreased by the insertion of
supplementary edges. The simplest structures are chains of four vertices, each having only
three minimal VCs, which leads directly to the second term in (41) as two pairs are connected
with probability 4c/N. More complicated non-backbone graphs lead to corrections, and may
be included by a non-trivial $\rho^{(0)}(\tilde z)$.
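The counting behind this interpretation can be reproduced by brute force. The sketch below assumes the estimate in the form $(W^2/2c) \ln 2 + (W^4/2c) \ln(3/4)$, i.e. $p_0 N/2$ pairs contributing ln 2 each and pair-pair links contributing ln(3/4) each; all function names are hypothetical.

```python
import math
from itertools import combinations


def lambert_w(c, tol=1e-14):
    """Principal branch W(c) for c >= 0, via Newton iteration on w*exp(w) = c."""
    w = math.log1p(c)
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - c) / (ew * (1.0 + w))
        w -= step
        if abs(step) < tol:
            break
    return w


def count_minimal_covers(n, edges):
    """Number of vertex covers of minimal size (brute force, tiny graphs only)."""
    for size in range(n + 1):
        count = sum(1 for s in combinations(range(n), size)
                    if all(u in s or v in s for u, v in edges))
        if count:
            return count
    return 0


def entropy_estimate(c):
    """Variational estimate: isolated pairs plus the first correction from linked pairs."""
    w = lambert_w(c)
    return w ** 2 / (2.0 * c) * math.log(2.0) + w ** 4 / (2.0 * c) * math.log(0.75)


print(count_minimal_covers(4, [(0, 1), (2, 3)]))          # two isolated pairs
print(count_minimal_covers(4, [(0, 1), (1, 2), (2, 3)]))  # chain of four vertices
print(entropy_estimate(2.0))
```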



6.6 Unphysical replica-symmetric saddle points 



We have seen that solution (27) of the replica-symmetric saddle-point equation (26) is correct
only up to average connectivity c = e. Before searching for replica-symmetry-broken saddle
points, we should however exclude the existence of further replica-symmetric saddle points.
We therefore consider again equation (26). It is consistent with any ansatz

$$ P_m(z) = \sum_{l=-m}^{\infty} p_l^{(m)}\, \delta\left(z + \frac{l}{m}\right) . \qquad (44) $$

One can easily write down the self-consistent conditions for the probabilities $p_l^{(m)}$, and find
out that, for m > 1, these have non-trivial solutions with positive $p_l^{(m)}$ only for c > e. We
will show this explicitly only for m = 2. The saddle-point equations read



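$$
\begin{aligned}
p_{-2} &= \exp\{-c (p_{-2} + p_{-1})\} \\
p_{-1} &= \exp\{-c (p_{-2} + p_{-1})\}\, c\, p_{-1} \\
p_0 &= \exp\{-c (p_{-2} + p_{-1})\} \left( c\, p_{-2} + \frac{c^2}{2}\, p_{-1}^2 \right)
\end{aligned} \qquad (45)
$$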
The only solution with non-zero $p_{-1}$ is

$$ p_{-2} = \frac{1}{c} , \qquad p_{-1} = \frac{\ln c - 1}{c} , \qquad p_0 = \frac{2 + (\ln c - 1)^2}{2c} . \qquad (46) $$

$p_{-1}$ is obviously positive only if c > e.

The corresponding threshold $x_c(c)$ would be larger than the old one resulting from m = 1,
which goes in the right direction compared to the systematic deviation of the numerical data.
The multi-peak solutions (44) are however unphysical due to the existence of effective
potentials of e.g. the value μ/m. This positive potential would force the corresponding vertex
to be in the uncovered backbone for large μ, but the only physical mechanism for this is given by
the global chemical potential μ. The influence of neighbors results only in negative or zero
potentials. Positive fractions of μ are consequently unphysical.

We can however interpret multi-peak solutions as a kind of hidden replica-symmetry
breaking. This will become clear in the following section.
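The m = 2 solution can be verified by direct substitution. The sketch assumes the three self-consistency conditions in the form $p_{-2} = e^{-c(p_{-2}+p_{-1})}$, $p_{-1} = e^{-c(p_{-2}+p_{-1})} c p_{-1}$ and $p_0 = e^{-c(p_{-2}+p_{-1})} (c p_{-2} + c^2 p_{-1}^2/2)$; the function names are illustrative.

```python
import math


def two_peak_solution(c):
    """Candidate weights (p_-2, p_-1, p_0) of the m = 2 multi-peak ansatz."""
    x = math.log(c) - 1.0
    return 1.0 / c, x / c, (2.0 + x * x) / (2.0 * c)


def residuals(c, p_m2, p_m1, p_0):
    """Left-hand side minus right-hand side of the three m = 2 conditions."""
    pref = math.exp(-c * (p_m2 + p_m1))
    return (p_m2 - pref,
            p_m1 - pref * c * p_m1,
            p_0 - pref * (c * p_m2 + (c * p_m1) ** 2 / 2.0))


for c in (2.0, math.e, 4.0):
    sol = two_peak_solution(c)
    print(c, sol, residuals(c, *sol))
```

The residuals vanish for every c, but the weight $p_{-1}$ is negative below c = e, so the solution is only admissible as a probability distribution above that connectivity.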



7 The simplest one-step replica symmetry broken solution

This section is dedicated to the appearance of replica-symmetry breaking (RSB) in VC.
Despite several efforts, the question of how to handle RSB in finite-connectivity systems is
still open. Most attempts [?, ?, ?] try to apply the first step of Parisi's RSB scheme
[38], which however was originally developed for infinite-connectivity spin glasses. Due to the
more complex structure of the order parameter a complete solution is however still missing.
Recently, based on the connection to combinatorial optimization, the interest in this question
was renewed [39], and some promising approximation schemes [?, ?] have been developed.
Here we closely follow the approach proposed in [39], which allows one to construct a simple
one-step RSB solution.

In case of one-step RSB, the full permutation symmetry of the order parameter
corresponding to the equivalence of all n replicas breaks down. According to Parisi's scheme,
the replicas can be grouped into n/m blocks of equal size m, where the symmetry is now
restricted to permutations of replicas within every block, or to permutations of full blocks.
We therefore introduce a new numbering of replicas by index pairs (a, α), with a = 1, ..., n/m
denoting the block number, and α = 1, ..., m counting the replicas within block a. Due to the
described symmetry, the order parameters $c(\vec\xi)$ thus depend on $\vec\xi$ only via the block quantities
$s_a = \sum_\alpha \xi^{a\alpha}$, or even more precisely, on the number of blocks having $s_a = ym$, which can
be described by

$$ v(y) = \sum_{a=1}^{n/m} \delta(y - s_a/m) . \qquad (47) $$

y stands for the average occupation number of a block and ranges from 0 to 1. Its discrete
nature present for natural n vanishes in the analytical continuation needed for the replica
limit n → 0.
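Following the cavity-like argumentation of Monasson [39], the order parameter can be
expressed as

$$
\begin{aligned}
c(\vec\xi) &= \int \mathcal{D}\rho\, \mathcal{P}[\rho] \prod_{a=1}^{n/m} \int dh\, \rho(h)\, \frac{e^{h s_a}}{(1 + e^h)^m} \\
&= \int \mathcal{D}\rho\, \mathcal{P}[\rho] \exp\left\{ \int_0^1 dy\, v(y) \ln\left[ \int dh\, \rho(h)\, \frac{e^{h m y}}{(1 + e^h)^m} \right] \right\}
\end{aligned} \qquad (48)
$$

where $\mathcal{P}[\rho]$ is a histogram of the local distributions $\rho_i(h)$, which themselves are histograms
of local effective potentials over all thermodynamically relevant pure states, see [38] for a
detailed discussion of this interpretation. In the second line, the analytic continuation in n
has already been made; m is now considered, as usual, as a parameter in the interval [0, 1]
which has to be optimized in the saddle-point solution. The only requirement on v(y) is that

$$ \int_0^1 dy\, v(y) = \frac{n}{m} \qquad (49) $$

vanishes in the replica limit n → 0.

This ansatz can be plugged into the saddle-point equation (20),

$$ c(\vec\xi) = \exp\left\{ -\lambda + \mu \sum_{a,\alpha} \xi^{a\alpha} + c \sum_{\vec\zeta} c(\vec\zeta) \prod_{a,\alpha} (1 - \xi^{a\alpha} \zeta^{a\alpha}) \right\} . \qquad (50) $$

Proceeding term by term on the right-hand side, we find

$$ \sum_{a,\alpha} \xi^{a\alpha} = \sum_a s_a = m \int_0^1 dy\, v(y)\, y \qquad (51) $$

and

$$
\begin{aligned}
\sum_{\vec\zeta} c(\vec\zeta) \prod_{a,\alpha} (1 - \xi^{a\alpha} \zeta^{a\alpha})
&= \int \mathcal{D}\rho\, \mathcal{P}[\rho] \prod_{a=1}^{n/m} \int dh\, \rho(h) \prod_{\alpha=1}^{m} \frac{\sum_{\zeta=0,1} e^{h\zeta} (1 - \xi^{a\alpha} \zeta)}{1 + e^h} \\
&= \int \mathcal{D}\rho\, \mathcal{P}[\rho] \prod_{a=1}^{n/m} \int dh\, \rho(h)\, (1 + e^h)^{-s_a} \\
&= \int \mathcal{D}\rho\, \mathcal{P}[\rho] \exp\left\{ \int_0^1 dy\, v(y) \ln\left[ \int dh\, \rho(h)\, (1 + e^h)^{-my} \right] \right\} .
\end{aligned}
$$

Plugging these results together with equations (48) and (51) into (50), we obtain for n = 0 a
closed equation for $\mathcal{P}[\rho]$ which has to be fulfilled for every v(y) satisfying condition (49).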

This saddle-point equation is still valid for any chemical potential. In the limit of minimal
vertex covers, i.e. for μ → ∞, this equation simplifies again. For the ρ(h) we assume an
ansatz similar to the replica-symmetric value for P(h) in (27),

$$ \rho(h) = \omega_\mu \sum_{l=l_-}^{l_+} \rho_l\, e^{\mu m |l| / 2}\, \delta(h + \mu l) , \qquad (52) $$

where the support of ρ is now restricted to l-values between $(l_-, l_+)$ with $l_- \geq -1$. This
l-interval changes from instance to instance drawn from $\mathcal{P}[\rho]$. The normalizing prefactor $\omega_\mu$
becomes irrelevant for n → 0 due to condition (49). The exponential factor is inspired by its
appearance in infinite-connectivity models, cf. [39]. Please note that the replica-symmetric
case can be obtained by $l_- = l_+$. Introducing weights $\mathcal{P}_{l_-, l_+}$ as the integrated weight of all
ρ(h) having the same $(l_-, l_+)$, the order parameter simplifies to

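$$ \lim_{\mu \to \infty} c[v/\mu] = \sum_{-1 \leq l_- \leq l_+} \mathcal{P}_{l_-, l_+} \exp(\nu_- l_- + \nu_+ l_+) \qquad (53) $$

where $\nu_- = m \int_{1/2}^{1} dy\, v(y)\, (1/2 - y)$ and $\nu_+ = m \int_0^{1/2} dy\, v(y)\, (1/2 - y)$. Details of this
calculation are delegated to appendix C. Our saddle-point equation thus becomes

$$ \sum_{-1 \leq l_- \leq l_+} \mathcal{P}_{l_-, l_+} \exp(\nu_- l_- + \nu_+ l_+) = \exp\left\{ -c\, (\mathcal{P}_{-1,-1} + \mathcal{P}_{-1,0}) - \nu_- - \nu_+ + c\, \mathcal{P}_{-1,-1}\, e^{\nu_- + \nu_+} + c\, \mathcal{P}_{-1,0}\, e^{\nu_+} \right\} \qquad (54) $$

which has to be fulfilled for all $\nu_-, \nu_+$. Please note that the m-dependence has completely
disappeared [46]. This equation can be easily solved: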



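$$
\begin{aligned}
\mathcal{P}_{-1,-1} &= \frac{1}{c} \\
\mathcal{P}_{-1,-1} + \mathcal{P}_{-1,0} &= \frac{\ln(c)}{c} \\
\mathcal{P}_{l_-, l_+} &= \frac{(\ln(c) - 1)^{l_+ - l_-}}{(l_- + 1)!\, (l_+ - l_-)!}\, \mathcal{P}_{-1,-1} = \frac{(\ln(c) - 1)^{l_+ - l_-}}{(l_- + 1)!\, (l_+ - l_-)!\, c}
\end{aligned} \qquad (55)
$$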



Let us discuss this solution: 



• At first we realize that $\mathcal{P}_{-1,0}$ is positive only for connectivities c > e. This is consistent
with our previous finding that replica symmetry is restricted to smaller c.

• Introducing $p_l$ as the sum over all $\mathcal{P}_{l_-, l_+}$ having $l = l_- + l_+$, saddle-point equation (54)
reduces for $\nu_- = \nu_+$ to equations (45) for the unphysical replica-symmetric saddle point
showing half-integer valued effective potentials. This underlines the interpretation of
these solutions as hidden-RSB solutions.

• As we do not know the non-backbone magnetization in the RSB solution, we are only
able to give lower and upper estimates for $x_c(c)$. The upper one,
$x_c(c) < 1 - \mathcal{P}_{-1,-1} - \mathcal{P}_{-1,0} = 1 - \ln(c)/c$, coincides with the rigorous upper bound of Gazmuri [?]. The
lower one would be $x_c(c) > 1 - \mathcal{P}_{-1,-1} - \mathcal{P}_{-1,0} - \mathcal{P}_{-1,+1} - \mathcal{P}_{0,0}$. Having in mind the
numerical result that non-backbone effective potentials have a positive bias, we can
however conclude $x_c(c) > 1 - \mathcal{P}_{-1,-1} - \mathcal{P}_{-1,0} - \mathcal{P}_{-1,+1}/2 - \mathcal{P}_{0,0}/2$, which is slightly better
than the replica-symmetric result. In figure [?] both results are nearly indistinguishable,
so we have omitted the RSB data from the figure.

• Also the evaluation of the backbone-size is slightly subtle. In principle we would expect 
that backbone vertices have p(h) which are supported either only on positive or only 
on negative fields. This would result in 



fcl = V i 1 = - 



C 
Due to the existence of the exponential factors in ansatz (52), also $\mathcal{P}_{l_-, l_+}$ with $l_- \neq -l_+$
lead to average occupation numbers zero and one, and thus contribute to the backbone:



l->-l;l+>\l-\ 

1 / (l-lnc) 2 \ , , 

= l--(l+lnc+ * 4 j . (57) 

Both values do not coincide with numerical findings, see also figure [?]. Probably this
could be cured by assuming $m \sim \mu^{-1}$, cf. [46], instead of $m \sim \mu^0$. This would remove
the exponential dominance of fields of largest absolute value for μ → ∞. We could
however construct no solution to this case.

We may conclude that the presented one-step saddle point improves the replica-symmetric
findings for $x_c(c)$, but is still plagued by certain problems. It remains an open question
whether these problems can be cured already by including a different scaling of m, or whether
more than one step of RSB is finally required.
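To give a feeling for the size of the improvement, the estimates discussed in this section can be compared numerically. The sketch rests on two assumptions that should be checked against the formulas above: the replica-symmetric cover size is taken as the covered backbone plus half of the non-backbone, $x_c = 1 - (2W + W^2)/(2c)$, and the one-step weights are taken as $\mathcal{P}_{l_-,l_+} = (\ln c - 1)^{l_+ - l_-} / [(l_- + 1)!\,(l_+ - l_-)!\,c]$. Function names are hypothetical.

```python
import math


def lambert_w(c, tol=1e-14):
    """Principal branch W(c) for c >= 0, via Newton iteration on w*exp(w) = c."""
    w = math.log1p(c)
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - c) / (ew * (1.0 + w))
        w -= step
        if abs(step) < tol:
            break
    return w


def rs_cover_size(c):
    """Replica-symmetric estimate: covered backbone plus half of the non-backbone."""
    w = lambert_w(c)
    return 1.0 - (2.0 * w + w * w) / (2.0 * c)


def rsb_weight(l_minus, l_plus, c):
    """Assumed one-step weight P_{l-,l+} for -1 <= l- <= l+ and c >= e."""
    k = l_plus - l_minus
    return (math.log(c) - 1.0) ** k / (math.factorial(l_minus + 1) * math.factorial(k) * c)


def rsb_estimates(c):
    """Return (lower estimate with half-covered non-backbone, upper bound 1 - ln(c)/c)."""
    upper = 1.0 - rsb_weight(-1, -1, c) - rsb_weight(-1, 0, c)
    lower = upper - rsb_weight(-1, 1, c) / 2.0 - rsb_weight(0, 0, c) / 2.0
    return lower, upper


for c in (math.e, 4.0, 5.0, 10.0):
    print(c, rs_cover_size(c), rsb_estimates(c))
```

Under these assumptions the two estimates coincide at c = e and differ only in the third decimal place at c = 5, consistent with the remark that the curves are nearly indistinguishable.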



8 Conclusion and outlook 

In this paper, we have presented a detailed analysis of size and structure of minimal vertex
covers on random graphs. In particular, we have calculated the dependence of the size of
minimal VCs on the average connectivity, and we have shown that those VCs are exponentially
numerous. Many statistical properties, e.g. the partial freezing into backbone and
non-backbone vertices, could be characterized. All our results are based on exact numerical
enumerations as well as replica calculations. We have found that replica-symmetric results
appear to be exact up to graph connectivities c = e ≈ 2.718, whereas replica-symmetry
breaking has to be included for an understanding of higher-connectivity graphs. This is
however a complicated task: Even if there has been some recent progress on the question of
one-step replica-symmetry breaking in finite-connectivity systems based on various
approximation schemes [39, ?, ?], a definite technical approach is still missing. Due to the simplicity
of its replica-symmetric solution, as compared e.g. to satisfiability problems [5], vertex cover
could be a good model for further progress in this direction.

In our paper, we have only considered finite-connectivity random graphs. These show
however a very simple geometrical structure. They are locally tree-like, and loops are of length
O(ln N). It would therefore be interesting to consider restricted graph ensembles which
include non-trivial local structures. The influence of such topological features on the solution
structure of combinatorial optimization problems still remains an interesting open question,
as other studied problems are also mainly defined on locally tree-like structures [?, ?]. Restricted
graph ensembles could therefore provide a possible starting point for further research.

A last comment concerns the interpretation of vertex covers as packings of hard spheres on
random lattices. We were able to describe the maximally dense packings, which were found
to show very interesting properties due to the disorder present in the graph: There were
backbone sites having the same occupation state in all densest packings, whereas others are
found to be free in some packings, occupied in others. This effect resembles the existence of
blocked and unblocked particles in real packings. With some modifications, the hard-sphere
lattice gas can therefore be understood as a possible mean-field model of granular packings,
compare also [?]. Work is in progress along these lines.

A The replica-symmetric limit n → 0

Starting from equations (19) and (20) we will present the calculation of the replica limit
n → 0 under the replica-symmetric ansatz

$$ c(\vec\xi) = \int dh\, P(h)\, \frac{\exp(h \sum_a \xi^a)}{(1 + e^h)^n} . \qquad (58) $$

The procedure is very similar to the one presented in [39] for Ising-spin-glass models. We
start with the grand partition function as given in (19):



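$$ \lim_{N \to \infty} \frac{1}{N} \ln \Xi = \lim_{n \to 0} \frac{1}{n} \left( -\sum_{\vec\xi} c(\vec\xi) \ln c(\vec\xi) - \frac{c}{2} + \mu \sum_{\vec\xi} c(\vec\xi) \sum_a \xi^a + \frac{c}{2} \sum_{\vec\xi, \vec\zeta} c(\vec\xi)\, c(\vec\zeta) \prod_a (1 - \xi^a \zeta^a) \right) \qquad (59) $$

where $c(\vec\xi)$ takes its saddle-point value. At first, we consider the combinatorial entropy and
use again a replica trick:

$$ \sum_{\vec\xi} c(\vec\xi) \ln c(\vec\xi) = \frac{\partial}{\partial l} \sum_{\vec\xi} c(\vec\xi)^l \bigg|_{l=1} . \qquad (60) $$

Assuming positive integer l at the beginning, and plugging in the replica-symmetric ansatz
for $c(\vec\xi)$, we write

$$ \sum_{\vec\xi} c(\vec\xi)^l = 1 + n \int dh_1 \cdots dh_l\, P(h_1) \cdots P(h_l)\, \ln\left(1 + \exp\left\{\sum_{m=1}^{l} h_m\right\}\right) - n l \int dh\, P(h) \ln(1 + e^h) + O(n^2) . $$

Introducing new variables $H_m = \sum_{j=1}^{m} h_j$, the last expression becomes

$$
\begin{aligned}
\sum_{\vec\xi} c(\vec\xi)^l &= 1 + n \int dH_1 \cdots dH_l\, P(H_1)\, P(H_2 - H_1) \cdots P(H_l - H_{l-1})\, \ln(1 + e^{H_l}) - n l \int dh\, P(h) \ln(1 + e^h) + O(n^2) \\
&= 1 + n \int dH_l \int \frac{dk}{2\pi}\, e^{i H_l k}\, P_{FT}(k)^l\, \ln(1 + e^{H_l}) - n l \int dh\, P(h) \ln(1 + e^h) + O(n^2) .
\end{aligned}
$$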
In the last step we have used the fact that the l-fold convolution of P(h) with itself can be
expressed as the Fourier back-transformation of the l-th power of its Fourier transform $P_{FT}$.
Now the differentiation with respect to l can be carried out, and according to (60) we find

$$ \sum_{\vec\xi} c(\vec\xi) \ln c(\vec\xi) = n \int \frac{dh\, dk}{2\pi}\, e^{ihk}\, P_{FT}(k)\, [-1 + \ln P_{FT}(k)]\, \ln(1 + e^h) . \qquad (61) $$
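The convolution property used in the last step is easy to check on a discrete example. The following sketch is purely illustrative and replaces the continuous distribution P(h) by a hypothetical three-point distribution; it compares the direct l-fold convolution with the inverse discrete Fourier transform of the l-th power of the transform, using zero padding to avoid wrap-around.

```python
import cmath


def dft(values, sign=-1):
    """Plain O(n^2) discrete Fourier transform; sign=+1 gives the unnormalised inverse."""
    n = len(values)
    return [sum(v * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                for j, v in enumerate(values))
            for k in range(n)]


def convolve(a, b):
    """Direct convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def fold_direct(p, l):
    """l-fold convolution of p with itself, computed term by term."""
    result = [1.0]
    for _ in range(l):
        result = convolve(result, p)
    return result


def fold_fourier(p, l):
    """Same quantity via the l-th power of the Fourier transform."""
    size = l * (len(p) - 1) + 1
    padded = list(p) + [0.0] * (size - len(p))
    spectrum = [z ** l for z in dft(padded)]
    return [(z / size).real for z in dft(spectrum, sign=+1)]


p = [0.2, 0.5, 0.3]
print(fold_direct(p, 3))
print(fold_fourier(p, 3))
```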



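The other terms in (59) can be evaluated directly,

$$ \sum_{\vec\xi} c(\vec\xi) \sum_a \xi^a = n \int dh\, P(h)\, \frac{e^h (1 + e^h)^{n-1}}{(1 + e^h)^n} = n \int dh\, P(h)\, \frac{1}{1 + e^{-h}} $$

$$
\begin{aligned}
\sum_{\vec\xi, \vec\zeta} c(\vec\xi)\, c(\vec\zeta) \prod_a (1 - \xi^a \zeta^a)
&= \int dh_1\, dh_2\, P(h_1)\, P(h_2) \sum_{\vec\xi, \vec\zeta} \prod_a \frac{(1 - \xi^a \zeta^a) \exp(h_1 \xi^a + h_2 \zeta^a)}{(1 + e^{h_1})(1 + e^{h_2})} \\
&= \int dh_1\, dh_2\, P(h_1)\, P(h_2) \left[ 1 - \frac{e^{h_1 + h_2}}{(1 + e^{h_1})(1 + e^{h_2})} \right]^n \\
&= 1 + n \int dh_1\, dh_2\, P(h_1)\, P(h_2) \ln\left[ 1 - \frac{e^{h_1 + h_2}}{(1 + e^{h_1})(1 + e^{h_2})} \right] + O(n^2) .
\end{aligned}
$$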



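Putting these results together, we find

$$
\begin{aligned}
\lim_{N \to \infty} \frac{1}{N} \ln \Xi = {} & \int \frac{dh\, dk}{2\pi}\, e^{ihk}\, P_{FT}(k)\, [1 - \ln P_{FT}(k)]\, \ln(1 + e^h) + \mu \int dh\, P(h)\, \frac{1}{1 + e^{-h}} \\
& + \frac{c}{2} \int dh_1\, dh_2\, P(h_1)\, P(h_2) \ln\left[ 1 - \frac{e^{h_1 + h_2}}{(1 + e^{h_1})(1 + e^{h_2})} \right]
\end{aligned} \qquad (62)
$$

which finally results in equation (23) for the vertex-cover entropy.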
For the saddle-point equation

$$ c(\vec\xi) = \exp\left\{ -\lambda + \mu \sum_a \xi^a + c \sum_{\vec\zeta} c(\vec\zeta) \prod_a (1 - \xi^a \zeta^a) \right\} \qquad (63) $$

we proceed analogously. Obviously both sides depend on $\vec\xi$ only via $y = \sum_a \xi^a$. The left-hand
side thus simplifies for n → 0:

$$ lhs = \int dh\, P(h)\, e^{hy} , \qquad (64) $$

whereas the right-hand side (rhs) gives

$$ rhs = \exp\left\{ -\lambda + \mu y + c \int dh\, P(h)\, (1 + e^h)^{-y} \right\} . \qquad (65) $$

We now can determine the Lagrange multiplier λ from the normalization of P(h). For y = 0,
the left-hand side equals one, whereas the right-hand side equals exp(−λ + c), which results
directly in λ = c, and thus in the replica-symmetric saddle-point equation given in the main text.

The same saddle-point equation can of course be derived by varying equation (62) directly
with respect to P(h). Note however that the result given here is stronger: We have shown
that the original saddle-point equation for $c(\vec\xi)$ is closed under our replica-symmetric ansatz,
thus leading to a real saddle point of the free energy. The second procedure would however
be important if we used a variational ansatz which does not close the $c(\vec\xi)$-equation.

B Calculation of the entropy

For calculating the entropy of minimal vertex covers, we start again with equation (19) for
the disorder-averaged grand partition function, but now we plug in the refined ansatz (34),
i.e.

$$ c(\vec\xi) = \int dz\, d\tilde z\, P(z, \tilde z)\, \frac{\exp\{(\mu z + \tilde z) \sum_a \xi^a\}}{(1 + e^{\mu z + \tilde z})^n} \qquad (66) $$

where $P(z, \tilde z)$ is assumed to stay a well-behaved probability distribution in the limit μ → ∞
of minimal vertex covers. Consistency with the dominant behavior discussed in section 6.2
requires

$$ \int d\tilde z\, P(z, \tilde z) = \sum_{l=-1}^{\infty} p_l\, \delta(z + l) \qquad (67) $$

with

$$ p_l = \frac{W(c)^{l+2}}{(l+1)!\, c} \qquad (68) $$

for μ → ∞, cf. (27). We therefore may write

$$ P(z, \tilde z) = \sum_{l=-1}^{\infty} p_l\, \delta(z + l)\, \rho^{(l)}(\tilde z) \qquad (69) $$

with probability distributions $\rho^{(l)}(\tilde z)$ which still have to be determined. By plugging ansatz
(66) into ln Ξ as given in (19) and following the same procedure as in the last section, we
find for finite μ:

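$$
\begin{aligned}
\lim_{N \to \infty} \frac{1}{N} \ln \Xi = {} & \int dz\, d\tilde z \int \frac{dk\, d\tilde k}{(2\pi)^2}\, e^{izk + i\tilde z \tilde k}\, P_{FT}(k, \tilde k)\, [1 - \ln P_{FT}(k, \tilde k)]\, \ln(1 + e^{\mu z + \tilde z}) \\
& + \mu \int dz\, d\tilde z\, P(z, \tilde z)\, \frac{1}{1 + e^{-\mu z - \tilde z}} \\
& + \frac{c}{2} \int dz_1\, dz_2\, d\tilde z_1\, d\tilde z_2\, P(z_1, \tilde z_1)\, P(z_2, \tilde z_2) \ln\left[ 1 - \frac{1}{(1 + e^{-\mu z_1 - \tilde z_1})(1 + e^{-\mu z_2 - \tilde z_2})} \right]
\end{aligned} \qquad (70)
$$

with $P_{FT}(k, \tilde k) = \int dz\, d\tilde z\, P(z, \tilde z) \exp\{-izk - i\tilde z \tilde k\}$ being the 2d Fourier transform of $P(z, \tilde z)$.
For μ → ∞, the dominant behavior seems to be of O(μ), but its coefficient has to vanish
at the saddle point as the entropy stays finite. This has been checked explicitly; without
presenting those details we therefore concentrate on the second term of $O(\mu^0)$ which will give
the entropy of minimal vertex covers,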

$$ s_{VC}(x_c(c)) = \lim_{\mu \to \infty} \left( \lim_{N \to \infty} \frac{1}{N} \ln \Xi - \mu \int dz\, d\tilde z\, P(z, \tilde z)\, \frac{1}{1 + e^{-\mu z - \tilde z}} \right) . \qquad (71) $$

Starting with the first term in (70), we have to leading orders

$$ \ln(1 + e^{\mu z + \tilde z}) \to \mu z\, \Theta(z) + \Phi(z, \tilde z) , \qquad \Phi(z, \tilde z) = \begin{cases} 0 & \text{if } z < 0 \\ \ln(1 + e^{\tilde z}) & \text{if } z = 0 \\ \tilde z & \text{if } z > 0 \end{cases} \qquad (72) $$

At the moment, replacing $\ln(1 + e^{\mu z + \tilde z})$ by $\Phi(z, \tilde z)$ is all we can do in the first term without
using the saddle-point equations for the $\rho^{(l)}$s. The situation is better for the last term in
(70). Having in mind that z can take only integer values smaller than or equal to +1, we use



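$$ \ln\left[ 1 - \frac{1}{(1 + e^{-\mu z_1 - \tilde z_1})(1 + e^{-\mu z_2 - \tilde z_2})} \right] \to \begin{cases} -\mu + \ln(e^{-\tilde z_1} + e^{-\tilde z_2}) & \text{if } z_1 = z_2 = 1 \\ \ln\left[ 1 - \frac{1}{1 + e^{-\tilde z_1}} \right] & \text{if } z_1 = 0,\ z_2 = 1 \\ \ln\left[ 1 - \frac{1}{1 + e^{-\tilde z_2}} \right] & \text{if } z_1 = 1,\ z_2 = 0 \\ \ln\left[ 1 - \frac{1}{(1 + e^{-\tilde z_1})(1 + e^{-\tilde z_2})} \right] & \text{if } z_1 = z_2 = 0 \\ 0 & \text{if } z_1 < 0 \text{ or } z_2 < 0 \end{cases} \qquad (73) $$

where all terms are dropped which are exponentially small in μ. Plugging this result into
(71), we find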


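$$
\begin{aligned}
s_{VC}(x_c(c)) = {} & \int dz\, d\tilde z \int \frac{dk\, d\tilde k}{(2\pi)^2}\, e^{izk + i\tilde z \tilde k}\, P_{FT}(k, \tilde k)\, [1 - \ln P_{FT}(k, \tilde k)]\, \Phi(z, \tilde z) \\
& + \frac{c}{2}\, p_{-1}^2 \int d\tilde z_1\, d\tilde z_2\, \rho^{(-1)}(\tilde z_1)\, \rho^{(-1)}(\tilde z_2)\, \ln(e^{-\tilde z_1} + e^{-\tilde z_2}) \\
& + c\, p_0\, p_{-1} \int d\tilde z\, \rho^{(0)}(\tilde z)\, \ln\left(1 - \frac{1}{1 + e^{-\tilde z}}\right) \\
& + \frac{c}{2}\, p_0^2 \int d\tilde z_1\, d\tilde z_2\, \rho^{(0)}(\tilde z_1)\, \rho^{(0)}(\tilde z_2)\, \ln\left(1 - \frac{1}{(1 + e^{-\tilde z_1})(1 + e^{-\tilde z_2})}\right)
\end{aligned} \qquad (74)
$$

which is equation (36).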

We continue with the derivation of the saddle-point equation; again we start from
the original equation for $c(\vec\xi)$ as given in eq. (20),

$$ c(\vec\xi) = \exp\left\{ -\lambda + \mu \sum_a \xi^a + c \sum_{\vec\zeta} c(\vec\zeta) \prod_a (1 - \xi^a \zeta^a) \right\} . \qquad (75) $$

Plugging in the replica-symmetric ansatz and continuing analogously to the previous
appendix for n → 0, we find

$$ \int dz\, d\tilde z\, P(z, \tilde z)\, e^{\mu z k + \tilde z k} = \exp\left\{ -c + \mu k + c \int dz\, d\tilde z\, P(z, \tilde z) \left[ 1 + e^{\mu z + \tilde z} \right]^{-k} \right\} . \qquad (76) $$

For $k = O(\mu^{-1})$ in the limit μ → ∞, we find back the old saddle-point equation for the
dominant effective chemical potentials. The saddle-point equations for the sub-dominant
corrections $\rho^{(l)}(\tilde z)$ are however obtained for $k = O(\mu^0)$. The corresponding limit μ → ∞
is not obvious due to the existence of terms like μk. We have to use equation (69). The
left-hand side of the last equation thus reads



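$$ \int dz\, d\tilde z\, P(z, \tilde z)\, e^{\mu z k + \tilde z k} = \sum_{l=-1}^{\infty} p_l\, e^{-\mu k l} \int d\tilde z\, \rho^{(l)}(\tilde z)\, e^{\tilde z k} . \qquad (77) $$

The dominant contribution for large μ and positive k is given by the term having l = −1, and
diverges exponentially as $e^{\mu k}$. Multiplying equation (76) with $e^{-\mu k}$ thus yields a well-defined
limit μ → ∞. Written in terms of the Fourier variable $\tilde k$, i.e. for $k = -i\tilde k$, we find

$$ \rho^{(-1)}_{FT}(\tilde k) = \exp\left\{ -c p_0 + c p_0 \int d\tilde z\, \rho^{(0)}(\tilde z)\, (1 + e^{\tilde z})^{i\tilde k} \right\} . \qquad (78) $$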
We now proceed by subtracting the dominant contributions $\sim e^{\mu k}$ on both sides of eq. (76),
and find for μ → ∞

$$ \rho^{(0)}_{FT}(\tilde k) = \rho^{(-1)}_{FT}(\tilde k)\, \rho^{(-1)}_{FT}(-\tilde k) , \qquad (79) $$

where we have used (78). Continuing by iteration, we finally find

$$ \rho^{(l)}_{FT}(\tilde k) = \rho^{(-1)}_{FT}(\tilde k)\, \left[\rho^{(-1)}_{FT}(-\tilde k)\right]^{l+1} . \qquad (80) $$

So it is very simple to solve all but one of these saddle-point equations. We can consequently
express $s_{VC}(x_c(c))$ in terms of $\rho^{(0)}(\tilde z)$, as is done in eq. (39) in section 6.4. $\rho^{(0)}(\tilde z)$ itself is
described by (40) which follows directly from equations (78, 79). The corresponding
calculations are lengthy but straightforward, so we do not present them here. The only trick which
has to be used is the following: Using (80) we may write



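$$
\begin{aligned}
P_{FT}(k, \tilde k) &= \sum_{l=-1}^{\infty} \frac{W(c)^{l+2}}{(l+1)!\, c}\, e^{ikl}\, \rho^{(-1)}_{FT}(\tilde k)\, \left[\rho^{(-1)}_{FT}(-\tilde k)\right]^{l+1} \\
&= \frac{W(c)}{c}\, e^{-ik}\, \rho^{(-1)}_{FT}(\tilde k) \sum_{l=-1}^{\infty} \frac{1}{(l+1)!} \left[ W(c)\, e^{ik}\, \rho^{(-1)}_{FT}(-\tilde k) \right]^{l+1} \\
&= \frac{W(c)}{c}\, e^{-ik}\, \rho^{(-1)}_{FT}(\tilde k)\, \exp\left[ W(c)\, e^{ik}\, \rho^{(-1)}_{FT}(-\tilde k) \right] .
\end{aligned} \qquad (81)
$$

This expression helps to simplify $\ln P_{FT}(k, \tilde k)$ in equation (74).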



C Evaluation of the RSB saddle-point equation 



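This last appendix shows how the μ → ∞ limit can be taken in the one-step RSB saddle-point
equation. We start with the order parameter as given in (48),

$$ c(\vec\xi) = \int \mathcal{D}\rho\, \mathcal{P}[\rho] \exp\left\{ \int_0^1 dy\, v(y) \ln\left[ \int dh\, \rho(h)\, \frac{e^{h m y}}{(1 + e^h)^m} \right] \right\} \qquad (82) $$

and plug in ansatz (52),

$$ \rho(h) = \omega_\mu \sum_{l=l_-}^{l_+} \rho_l\, e^{\mu m |l| / 2}\, \delta(h + \mu l) . \qquad (83) $$

We assume in particular that $\rho_{l_\pm} \neq 0$ for uniqueness of the definition of $l_-$ and $l_+$. Setting
$\rho_l = 0$ for all $l < l_-$ and all $l > l_+$, we find for the exponent in (82):

$$
\begin{aligned}
\{\ldots\} &= \int_0^1 dy\, v(y) \ln\left[ \int dh\, \rho(h)\, \frac{e^{h m y}}{(1 + e^h)^m} \right] \\
&= \int_0^1 dy\, v(y) \ln\left[ \rho_{-1}\, e^{-\mu m (1/2 - y)} + 2^{-m} \rho_0 + \sum_{l > 0} \rho_l\, e^{\mu m l (1/2 - y)} \right]
\end{aligned} \qquad (84)
$$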



where only the dominant contribution in μ is kept in every term of [···]. $\omega_\mu$ can be skipped
in the last line, because v(y) has to have zero integral due to (49). For large μ this is
exponentially dominated by only one term which depends on y: If y < 1/2 the term with
$l = l_+$ dominates, for y > 1/2 the $l_-$-term becomes exponentially larger than all others.
Introducing

$$ \nu_- = m \int_{1/2}^{1} dy\, v(y) \left( \frac{1}{2} - y \right) , \qquad \nu_+ = m \int_0^{1/2} dy\, v(y) \left( \frac{1}{2} - y \right) \qquad (85) $$

we conclude

$$ \lim_{\mu \to \infty} \{\ldots\} \big|_{v \to v/\mu} = \nu_- l_- + \nu_+ l_+ , \qquad (86) $$

and

$$ \lim_{\mu \to \infty} c[v/\mu] = \sum_{-1 \leq l_- \leq l_+} \mathcal{P}_{l_-, l_+}\, e^{\nu_- l_- + \nu_+ l_+} \qquad (87) $$

which is the left-hand side of the saddle-point equation. On the right-hand side, an integral
similar to (84) has to be determined. Following exactly the same scheme as above, we find
the expression given in equation (54).

In principle, also a scaling $m \sim \mu^{-1}$ can be plugged in. This would be analogous to
infinite-connectivity spin-glass models in the zero-temperature limit, and was also
considered in the variational approach [?] to 3-satisfiability. The inclusion into the presented
formulation leads however to saddle-point equations which could not be solved by the
authors.








Figure 1: A small sample graph with a minimum vertex cover of size 3. The vertices belonging 
to the minimum vertex cover $V_{vc}$ are dark. For this graph the heuristic fails to find the true 
minimum cover, because it starts by covering the root vertex, which has the highest degree, 3. 
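
The failure described in this caption can be reproduced on a toy graph. The sketch below is an assumption-laden illustration: the edge list is a hypothetical tree (a root of degree 3 whose three children each carry one leaf), chosen to match the caption's description but not necessarily identical to the graph in the figure, and `greedy_cover` is a plain highest-degree-first heuristic written for this example.

```python
from itertools import combinations

def greedy_cover(edges):
    """Repeatedly cover a vertex of highest current degree and delete its edges."""
    remaining = set(edges)
    cover = []
    while remaining:
        degree = {}
        for u, v in remaining:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        # Ties are broken in favour of the smallest vertex label.
        best = max(sorted(degree), key=lambda vtx: degree[vtx])
        cover.append(best)
        remaining = {e for e in remaining if best not in e}
    return cover

def minimum_cover(edges):
    """Exhaustive search for a smallest vertex cover (tiny graphs only)."""
    vertices = sorted({v for e in edges for v in e})
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            if all(u in subset or v in subset for u, v in edges):
                return list(subset)

# Hypothetical example: root 0, children 1-3, one leaf below each child.
edges = [(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 6)]
print(greedy_cover(edges))   # starts with the root and ends with 4 vertices
print(minimum_cover(edges))  # the three children suffice
```

Covering the root first removes only three edges and leaves three disjoint edges behind, so the heuristic needs four vertices where three are enough.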
Figure 2: Example of how the divide-and-conquer algorithm operates. The graph is shown at 
the top. The vertex i with the highest degree is considered. In case it is covered (left 
subtree), all incident edges can be removed. In case it is uncovered (right subtree), all 
neighbors have to be covered and all edges incident to the neighbors can be removed. In both 
cases, the graph may split into several components, which can be treated independently by 
recursive calls of the algorithm. 
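
The branching rule of this caption can be written as a short recursion. The following Python sketch is a simplified illustration, not the authors' implementation: it returns only the size of a minimum cover, omits any bounding or pruning, and assumes a simple graph given as a list of vertex pairs without self-loops. The function names are hypothetical.

```python
def adjacency(edges):
    """Neighbor sets of all vertices that appear in at least one edge."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def components(edges):
    """Split an edge set into the edge sets of its connected components."""
    adj = adjacency(edges)
    seen, parts = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for w in adj[u] - seen:
                seen.add(w)
                stack.append(w)
        parts.append(frozenset(e for e in edges if e[0] in comp))
    return parts

def min_cover_size(edges):
    """Size of a minimum vertex cover by branching on a highest-degree vertex."""
    edges = frozenset(edges)
    if not edges:
        return 0
    parts = components(edges)
    if len(parts) > 1:  # independent components are solved separately
        return sum(min_cover_size(p) for p in parts)
    adj = adjacency(edges)
    i = max(adj, key=lambda vtx: len(adj[vtx]))
    # Branch 1: i is covered, so all edges incident to i disappear.
    covered = 1 + min_cover_size(e for e in edges if i not in e)
    # Branch 2: i is uncovered, so every neighbor must be covered.
    nbrs = adj[i]
    uncovered = len(nbrs) + min_cover_size(
        e for e in edges if e[0] not in nbrs and e[1] not in nbrs)
    return min(covered, uncovered)

print(min_cover_size([(0, 1), (0, 2), (0, 3), (1, 4), (2, 5), (3, 6)]))
```

Each branch removes at least one edge, so the recursion terminates; its worst-case running time is still exponential in the number of vertices.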




Figure 3: Phase diagram: Fraction $x_c(c)$ of vertices in a minimal vertex cover as a function of 
the average connectivity $c$. For $x > x_c(c)$, almost all graphs have covers with $xN$ vertices, 
while they have almost surely no cover for $x < x_c(c)$. The solid line shows the replica- 
symmetric result. The circles represent the results of the numerical simulations. Error bars 
are much smaller than symbol sizes. The upper bound of Harant is given by the dashed 
line, the bounds of Gazmuri by the dash-dotted lines. The vertical line is at $c = e$. Inset: 
All numerical values were calculated from finite-size scaling fits of $x_c(N, c)$ using functions 
$x_c(N) = x_c + aN^{-b}$. We show the data for $c = 2.0$ as an example. 
Figure 4: The total backbone size $b_{uncov}(c) + b_{cov}(c)$ of minimal vertex covers as a function 
of $c$. The solid line shows the replica-symmetric result, the dotted ones are the two results of 
one-step RSB. Numerical data are represented by the error bars. They were obtained from 
finite-size scaling fits similar to the calculation for $x_c(c)$. The vertical line is at $c = e$, where 
replica symmetry breaks down. 




Figure 5: Distribution of connectivities $d$ for $c = 2.0$. We show the total connectivity 
distribution, given by a Poissonian of mean $c$, as well as results describing the minimal vertex 
covers. The total distribution is divided into three contributions arising from the vertices 
which either are not in the backbone ($0 < \langle x \rangle < 1$) or which are in the covered/uncovered 
backbone ($\langle x \rangle = 0/1$). Analytical predictions are represented by the lines (which are guides 
to the eye only, connecting the results for integer arguments), while the numerical results 
for $N = 17, 35, 70$ are displayed using the symbols. 
Figure 6: Entropy of minimal vertex covers as a function of the average connectivity $c$. The solid 
line results from the Gaussian approximation described in section 6A, while numerical data are 
given by the symbols with error bars. Each numerical result was obtained by an extrapolation 
$N \to \infty$ via fitting a function $s_c(N) = s_c + aN^{-\beta}$ to the data for each $c$. The vertical line 
is at $c = e$. 
Figure 7: Examples of the smallest non-backbone graphs. Note that all graphs can be divided 
into connected vertex pairs and some supplementary edges connecting different pairs. A 
similar structure is also found for the full non-backbone subgraph at connectivities $c < e$. 
Figure 8: Numerical histograms of local average occupation numbers $\langle x \rangle$ of non- 
backbone vertices for average connectivities $c = 2.0$ and $8.0$. The upper distribution is 
perfectly symmetric, as predicted by theory for all $c < e$. The lower one shows an obvious 
bias towards higher occupation. The effect becomes stronger with increasing connectivity. 
Note also the existence of pronounced peaks in both distributions. These result from 
small non-backbone components or dangling ends of the giant cluster; e.g. the peaks at 
$m = 1/3, 2/3$ appear in chains of four vertices connected by three edges, as given by the 
second graph in the previous figure. The weight of these peaks decreases with increasing size 
of the giant non-backbone component. 
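
The peak positions 1/3 and 2/3 quoted for the chain of four vertices can be verified by enumerating all minimum vertex covers of that chain. The sketch below is a brute-force illustration with hypothetical function names; it assumes the convention that the occupation number of a vertex is 1 when the vertex is uncovered and 0 when it is covered.

```python
from fractions import Fraction
from itertools import combinations

def minimum_covers(n, edges):
    """All vertex covers of minimum size for vertices 0..n-1 (brute force)."""
    for size in range(n + 1):
        found = [set(s) for s in combinations(range(n), size)
                 if all(u in s or v in s for u, v in edges)]
        if found:
            return found
    return []

def uncovered_averages(n, edges):
    """Fraction of minimum covers in which each vertex is left uncovered."""
    covers = minimum_covers(n, edges)
    return [Fraction(sum(v not in c for c in covers), len(covers))
            for v in range(n)]

# Chain of four vertices connected by three edges.
chain = [(0, 1), (1, 2), (2, 3)]
print(minimum_covers(4, chain))      # three covers of size 2
print(uncovered_averages(4, chain))  # 2/3 for the end vertices, 1/3 for the inner ones
```

The chain has exactly three minimum covers, {0, 2}, {1, 2} and {1, 3}, so no vertex is frozen and the local averages take only the values 1/3 and 2/3.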
Figure 9: Fraction $f_{\max} = C_{\max}/[(1 - b_c)N]$ of the largest component of the non-backbone 
subgraph from numerical calculations as a function of graph size $N$ up to $N = 560$. For 
$b_c(c)$ the numerical values were taken. In a double logarithmic plot, for connectivities smaller 
than the predicted threshold $c \approx 1.434$, the function $f_{\max}(N)$ has a negative curvature, 
indicating that $f_{\max}$ converges towards zero. Thus, for small connectivities, the non-backbone 
does not percolate. For larger connectivities, $f_{\max}(N)$ has a positive curvature, and fits of 
the form $f(N) = f_\infty + bN^{c}$ result in strictly positive values, here $f_\infty = 0.17(1)$ ($c = 1.6$) and 
$f_\infty = 0.37(3)$ ($c = 2.0$). Hence, the non-backbone percolates. 






